Introduction

The Qinghai-Xizang Plateau (QXP), recognized as the highest plateau globally and the largest within China, boasts an average elevation exceeding 4000 m, earning it the moniker “Roof of the World.” The environmental stressors present on the QXP, including low oxygen levels, extreme temperatures, and intense ultraviolet (UV) radiation, present significant challenges to the survival of indigenous species1. Numerous studies have substantiated the distinct molecular mechanisms underlying high-altitude (HA) adaptation across various species2,3,4,5. Similarly, convergent and parallel evolutionary adaptations associated with HA environments have been demonstrated across species with broader phylogenetic relationships. For instance, the EPAS1 gene plays a pivotal role in the HA adaptation mechanisms of various species, including the Tibetan Mastiff6, Tibetan wolf7, Tibetan pig8, Tibetan goat9, and Tibetan horse10. Parallel amino acid site substitutions in the RETSAT gene in four plateau animals result in the cardiac convergence phenotype required for adaptation to the hypoxic environment of the QXP11. Notably, convergent or parallel adaptive evolution appears to occur more frequently among species that are more closely phylogenetically related. The EPAS1 and FEN1 genes related to the hypoxia response and UV-radiation damage repair in three Thermophis species had the same amino acid site substitution12. Eight identical amino acid site substitutions have been observed in six genes related to HA adaptation in three Rhinopithecus species13. Exploring the mechanisms of HA adaptation in species contributes to our understanding of speciation and biodiversity, elucidates the relationship between genetic variation and environmental adaptation during the evolutionary process, and provides significant guidance for the prevention and treatment of HA illnesses. However, current research on the HA adaptive mechanisms of closely related species has primarily centered on amphibians, reptiles, birds, and primates, with relatively scant knowledge on the adaptive mechanisms of carnivorans at plateau environments12,13,14,15. Investigating the HA adaptation mechanisms of carnivorans contributes to a more comprehensive and profound understanding of the diverse strategies employed by different species in adapting to extreme environments.

Vulpes belong to the Carnivora and Canidae, and consist of 12 species; their distribution covers most parts of the world, except Antarctica16,17. Vulpes vulpes (V. vulpes) has the most extensive distribution range, including most parts of Asia, Europe, North America, and a small part of Africa and Australia18. A wider distribution range implies a strong ability to adapt to complex habitat types, and the selection pressure exerted by different habitat types promotes the phenotypic differentiation and environmental adaptation mechanisms of populations in different regions. Currently, V. vulpes has the most subspecies differentiation of Carnivora species, with a total of 47 subspecies, including Vulpes vulpes montana (V. v. montana)19, which is better adapted to the QXP environment compared with the others. Studies have shown that V. v. montana adapted to the QXP for a short period of time, only about 0.11 Mya20. Vulpes ferrilata (V. ferrilata) is another Vulpes species adapted to the QXP environment. Previous studies suggested that V. ferrilata adapted to the QXP earlier than V. v. montana, and the divergence time of V. vulpes and V. ferrilata coincided with the uplift of the QXP21. Therefore, V. v. montana and V. ferrilata provides an ideal model for investigating the similarities and differences in HA adaptive mechanisms among closely related carnivorans.

In this study, we conducted genome assembly on the V. v. montana at the chromosome level, and performed comparative genomics to explore the HA adaptation mechanism of Vulpes species. We aimed to reveal the similarities and differences in the HA adaptation mechanism of Vulpes and provide new evidence and perspectives for the HA adaptation mechanism of species. In addition, the high-quality of the genome also provides important resources and theoretical support to the study of the phylogeny, divergence time, and the correlation of geological events in Vulpes, which lays a solid foundation for further studies.

Results and discussion

Genome assembly and annotation

We conducted an in-depth analysis of the V. v. montana genome using NGS, long-read, and Hi-C sequencing technologies. The sizes of the clean reads obtained from NGS, long-read, and Hi-C sequencing were 82, 25, and 256 Gb, respectively, and meanwhile the results of the genome survey indicated that the estimated size of the V. v. montana genome was 2415 Mb. Based on the ratio of the clean reads to the estimated genome size, the total sequencing depth was calculated to be 250.1 X (Table 1). The genome k-mer depth, heterozygosity, and duplication rate were 27, 0.49, and 51.87%, respectively (Fig. S1 and Table S1). The polished assembly genome size, following the correction of NGS and long-read sequencing data, was ascertained to be 2.34 Gb, with a contig N50 length of 47.68 Mb (Table 2). Hi-C sequencing data were used to further improve the assembly quality of the genome, resulting in 185 contigs being assembled into 65 scaffolds, with a scaffold N50 length of 139.03 Mb (Table 2). Finally, the V. v. montana genome contained 17 scaffolds (2,335,090,060 bp) mounted on 17 chromosomes, with a total mounting rate of 99.51% (Table S2).

Table 1 Sequencing statistics generated by different platforms
Table 2 The information of V. v. montana genome based on third-generation sequencing

Sequence integrity and consistency of the V. v. montana genome was subsequently evaluated, the genome contained a total of 13,832 complete BUSCOs, accounting for 95.4% of the total BUSCOs (Table S3). Furthermore, Core Eukaryotic Genes Mapping Approach (CEGMA) analysis showed that the genome assembled 228 core genes, accounting for 91.94% coverage (Table S4). The results of both assessments indicated that the integrity of the polished genome met the requirements for subsequent analyses. NGS genome data of small fragment reads were aligned with the genome to determine the sequence consistency of the final version of the genome assembly. The genome alignment rate of V. v. montana was 99.24% and the homozygous SNP rate was 0.000132% (Table S5). This indicated a high consistency and accuracy of the genome. Finally, a Hi-C interaction heat map was generated to illustrate the chromosomal interactions based on the genomic scaffolds assembled through Hi-C analysis. The genome mounting of V. v. montana clearly distinguished 17 chromosomal groups (Fig. S2). This supported that the genome was successfully assembled at the chromosome level.

After annotation, the proportion of repeat sequences in the genome of V. v. montana was 43.65%, with a total length of 1,024,343,135 bp (Table S6). De novo prediction, homology prediction, and RNA-Seq prediction were used for EVM integration annotations for structural prediction of the V. v. montana genome. After integration, 21,551 genes were obtained with average gene and CDS lengths of 48,709.33 bp and 1,620.43 bp, respectively (Table S7). Additionally, in the results of comparisons with different databases, a total of 20,056 genes were annotated in the V. v. montana genome, accounting for 93.06% of its total gene set (Table S8 and Fig. S3).

The methods of BUSCO and comparison of annotation information of closely related species were used to evaluate the accuracy of the genome annotation results. A total of 8647 complete BUSCOs were found in V. v. montana, accounting for 93.7% (Table S9). V. v. montana had 21,551 genes, while the average number of genes in the other eight closely related species was 20,949. The average length of transcripts and CDS in V. v. montana was 48,709.33 bp and 1,620.43 bp, respectively, while the average length of the other eight closely related species was 40,124.19 and 1,616.31 bp, respectively (Table S10). Statistical analysis found that the distribution trend of V. v. montana was consistent with that of closely related species, indicating that the annotation results were reliable (Fig. S4).

Genome collinearity analysis

Circo's analysis was conducted on the pseudo-chromosomes of V. v. montana to further ascertain the quality of the genome assembly and annotation (Fig. 1A). The uniform distribution of gene density, GC content, and BAM coverage across the 17 pseudo-chromosomes, coupled with a relatively low frequency of N-base content, indicates that the genome assembly and annotation have achieved the necessary standards for subsequent analytical procedures. Additionally, this study explored the chromosomal collinearity of Vulpes species by integrating the genomes of three other existing Vulpes species. The results highly correspond with the phylogenetic relationships of four Vulpes species from previous research21,22,23, suggesting that the closer the phylogenetic relationships among Vulpes species, the higher the degree of chromosomal collinearity (Fig. 1B). The 17 chromosomes of V. v. montana exhibited complex correspondences with those of V. ferrilata, mirroring the pattern observed between V. lagopus and V. v. montana. Intriguingly, the X chromosome of V. lagopus was found to align with the 10th chromosome of V. v. montana and V. ferrilata, as well as the 0th chromosome of V. corsac. Among these four Vulpes species, only the sex chromosomes demonstrated a high degree of collinearity. This observation suggests that sex chromosomes have maintained a more conserved position throughout the evolutionary process24.

Fig. 1: Landscape of the V.
figure 1

v. montana genome and collinear analysis about four Vulpes species. A Landscape of V. v. montana genome. From outside to inside, assembled chromosomes (a); GC content on each chromosome (b); GC skew on each chromosome (c); N ratio on each chromosome (d); bam coverage on each chromosome (e); gene density on each chromosome (f). B Genome collinear analysis about V. lagopus, V. v. montana, V. ferrilata and V. corsac. The silhouette of the V. lagopus was sourced from PHYLOPIC under the Public Domain Mark 1.0 license. Silhouettes of the other three Vulpes species were photographed and created by the first author of this study.

Phylogenetic relationship and demographic history of Vulpes species

A total of 212,210 genes were identified during the identification of gene families for phylogenetic analysis across the ten species. Ultimately, 8741 single-copy orthologous genes were used to construct the phylogenetic relationships (Table S11). Gap-less sequences were used to determine the divergence times of the species (Fig. 2A). Our findings indicate that V. lagopus diverged from the other three relatives approximately 5.38 million years ago (Mya) (95% CI: 7.3–3.6 Mya). This divergence time is congruent with the fossil record of the species, supporting the hypothesis that V. lagopus originated in the QXP around 5 Mya25. The divergence time between V. ferrilata and V. corsac was estimated at 3.09 Mya (95% CI: 4.3–2.1 Mya); the common ancestor of V. ferrilata and V. corsac diverged from V. v. montana approximately 4.49 Mya (95% CI: 6.1–3.0 Mya). It is noteworthy that in this study, the divergence time between V. ferrilata and V. corsac is earlier than previously reported20. Based on the theory that the period from 3.6 to 3 Mya was characterized by significant uplift of the QXP, we hypothesize that the divergence of these two species may be intricately linked to this geological event26. The divergence times of the four Vulpes species obtained in this study resolve previous disputes over their divergence times and provide a more scientific and accurate reference for subsequent studies on a wider range of Vulpes species and even carnivorans divergence times.

Fig. 2: Phylogenetic relationship of the V.
figure 2

v. montana with other species and historical demography of two high-altitudeVulpes species. A Phylogenetic relationship of the V. v. montana with other species. The estimated divergence time (million years) and the 95% confidential intervals are labeled at each branch site (Confidence intervals are shown by red translucent squares). The bottom row of various colored squares represents different geological periods. B Estimated effective population size of the two high-altitude Vulpes species. Except for the silhouettes of V. ferrilata, V. corsac, and V. v. montana, which were photographed and created by the first author of this study, all other species silhouettes used in this research were sourced from PHYLOPIC under either the CC0 1.0 Universal Public Domain Dedication or Public Domain Mark 1.0 licenses.

To elucidate the demographic histories of the two HA Vulpes species (V. ferrilata and V. v. montana), we estimated the historical effective population sizes (Ne) of their ancestral lineages (Fig. 2B). As a widely distributed species, V. v. montana exhibited relatively stable Ne fluctuations, suggesting minimal impact from climatic changes. This result is also consistent with the species’ exceptional environmental adaptability27. Furthermore, the Ne of V. v. montana experienced a slight decline after 10,000 years, potentially attributable to increased human activity during the Neolithic period, similar to observations in Pavo muticus and Ailuropoda melanoleuca28,29. In contrast, V. ferrilata showed a sharp decrease in Ne during the Last Glacial Maximum, followed by a rapid recovery post-glacial period, indicating high environmental sensitivity and significant susceptibility to climatic fluctuations. Overall, these findings suggest that the population dynamics of these two HA Vulpes species may be governed by entirely distinct factors, highlighting the need for more targeted conservation strategies.

Gene family expansion and contraction of HA Vulpes species

A total of 9264 single-copy orthologous genes were identified for evolution analysis across the nine species. (Table S12). Gene families that have undergone significant expansion in V. ferrilata and V. v. montana were substantially enriched in 58 and 86 Gene Ontology (GO) terms, respectively (p < 0.01 and FDR < 0.01) (Supplementary Data 1 and 2, Fig. S5). Notably, the glutathione metabolic process (GO:0006749) and ATP hydrolysis activity (GO:0016887) were among the GO terms commonly significantly enriched in both species, suggesting that the significantly expanded gene families within these terms may be associated with the adaptation of Vulpes species to the hypoxic and cold conditions of HA environments. Exposure to a HA environment can lead to a marked increase in oxidative glutathione levels, which can be deleterious to the internal glutathione metabolism and the antioxidant defense mechanisms of the body. Glutathione metabolism is known to play an essential role in the defense against free radical and reactive oxygen species-induced damage30. Additionally, ATP hydrolysis, a process that is accompanied by the release of heat, is recognized for its key role in the body’s resistance to cold stress31.

GO enrichment results for the contracted gene families suggested that the most significantly enriched terms in both HA Vulpes species included various types of ion channel activity and transport (GO:0001518, 0005248, 0005261, 0006814, 0005216, and 0006811) (Supplementary Data 3 and 4, Fig. 3A, B). Ion channels play a key role in regulating cellular functions in various biological processes, including neuronal transmission, ventilation control, cardiac contractility, and control of vascular tension, and oxygen content has a significant impact on ion concentration and channels32,33,34. In addition to ion channels, the enrichment of olfactory receptor activity (GO:0004984) might be associated with the adaptation of V. ferrilata and V. v. montana to the HA environment. The most significant HA adaptive differences found in human populations are decreased smell and increased taste35. Similar conclusions were drawn in studies on yaks and other low-altitude cattle36. Therefore, gene families enriched in terms of ion channels, transport, and olfactory receptor activity could be related to the adaptation of V. ferrilata and V. v. montana to the low-oxygen environment of the plateau. This suggests that the low-oxygen environmental pressure of the QXP is a key factor in the contraction of gene families with similar functions in Vulpes species.

Fig. 3: Analysis of gene family contractions in two high-altitude Vulpes Species and enrichment of selective pressure of V.
figure 3

v. montana. A GO enrichment analysis of significantly contracted gene families in the V. v. montana. B GO enrichment analysis of significantly contracted gene families in the V. ferrilata. C Functional enrichment analysis of positively selected genes in the V. v. montana.

Positive selection genes of HA Vulpes species

After rigorous screening, 48 and 89 PSGs were retained for enrichment analysis in V. ferrilata and V. v. montana, respectively (Supplementary Data 5 and 6). Given the relatively small number of PSGs in V. ferrilata, no significant enrichment of GO terms was observed. Consequently, our analysis was concentrated on the PSGs of V. v. montana. The results indicated that the PSGs in V. v. montana might be associated with its adaptation to the QXP environment. For instance, PSGs enriched in DNA metabolic processes, damage response, and repair (GO:0006281, 2001022, 0051052, 0006259, and 0006284) may be related to the adaptation to DNA damage caused by high UV-radiation36,37. Genes enriched in immune-related terms (Interleukin-6 family signaling and Th17 cell differentiation pathways) may be associated with the immune damage due to hypoxia38,39. Additionally, genes enriched in the regulation of cold-induced thermogenesis (GO:0120161) may be related to the adaptation to low temperatures. We selected ten representative enriched terms for display (Fig. 3C), and the complete enrichment results can be found in the Supplementary Data 7. It is noteworthy that hypoxia response-related genes, FAM13A and MAPKAPK540,41, are identified as PSGs in both V. ferrilata and V. v. montana. FAM13A (114 R in V. ferrilata and 260E in V. v. montana) had different positive selection sites in the two Vulpes species, whereas MAPKAPK5 (132I and 133 A) had the same sites. Overall, the same PSGs related to the hypoxia response indicated that FAM13A and MAPKAPK5 play key roles in the hypoxia adaptation of Vulpes species, and the different PSGs indicated that there are unique HA adaptation mechanisms among different Vulpes species.

Parallel amino acid site substitution of HA Vulpes species

In V. ferrilata and V. v. montana, a total of 27 genes underwent parallel amino acid site substitutions (Table S13), and significant structural alterations were observed in the proteins of four immune response-associated genes (STAB242, KLK1343, DDX6044, and IGSF2245) (Fig. S6). The hypoxic environment of the QXP as a source of environmental stress can cause changes in the autonomic nervous system and endocrine function, and these changes will further have direct and long-term effects on the immune system46,47. Therefore, it is plausible that the protein structures of these four genes have undergone alterations associated with adaptation to hypoxia.

Employing a comprehensive screening strategy that encompassed the assessment of deleterious mutations, identification of conserved structural domains, and analysis of changes in protein hydrophilicity and hydrophobicity, we have identified the IGSF22 gene from among the four candidates for further functional validation (Fig. 4, Fig.S7, and Supplementary Data 8). Our findings reveal that a parallel amino acid substitution has occurred at position 132 (Ser to Phe, IGSF22S132F) of the IGSF22 gene in HA Vulpes (Fig. 4A). This substitution is situated within the immunoglobulin C-2 type (IGc2) domain of the IGSF22 gene, which spans from positions 79 to 147. The amino acid change at this site has consequently altered the hydrophilicity and hydrophobicity from amino acids 61 to 69 and 301 to 309, respectively, thereby exerting an impact on the protein structure of IGSF22 (Fig. 4B, C). The IGSF22 gene belongs to the immunoglobulin super family (IGSF). The number of immunoglobulins in the blood of patients with HA pulmonary edema is linearly related to the intensity of the plateau reaction48, and intense exercise on the plateau exacerbates hypoxia, thereby affecting the number of immunoglobulins in the body45. Additionally, previous studies have reported genetic differences in the IGSF10 and IGSF21 genes among goats at different altitudes9, indicating that this gene family may play a role in HA adaptation across different species. However, no research has yet revealed how IGSF22 mutations help species adapt to HA environments.

Fig. 4: Parallel amino acid site substitutions in the IGSF22 gene of two high-altitude Vulpes species.
figure 4

A Differential genomic loci of the IGSF22 gene between two high-altitude Vulpes species and their closely relatives. B The alterations in hydrophilicity and hydrophobicity of the protein encoded by the IGSF22S132F genotype. C Protein 3D structure prediction outcomes for IGSF22 gene variants.

In order to test the potential functional differences of the IGSF22 genotype (S132F) in HA Vulpes species compared with the low-altitude relatives, we assessed the effects of the S132F mutation in HEK293T cells under hypoxic conditions. Western blotting analysis revealed that the IGSF22S132F exhibited significantly higher expression levels compared to the wild-type (WT) under hypoxic conditions (P < 0.01, Fig. 5A; Figs. S8, S9 and Supplementary Data 9). Results from RT-qPCR assays demonstrate that the mRNA expression levels of the IGSF22S132F were slightly lower than the WT under normoxic conditions, with no significant difference in expression levels observed (Fig. 5B and Supplementary Data 10). However, under hypoxic conditions, the mRNA expression of IGSF22S132F is not only substantially higher than that of the WT but also exhibits significant (p < 0.05) and highly significant (p < 0.01) differences compared to the expression levels of IGSF22S132F and WT under normoxic conditions, respectively. We hypothesize that hypoxic conditions significantly enhance the transcriptional activity of IGSF22S132F. However, the protein expression levels did not increase proportionally (Fig. 5A), potentially due to constraints imposed by multiple factors, including transcriptional and post-transcriptional regulation, translation, post-translational modifications, and protein stability49. After a 24 h incubation period, IGSF22S132F demonstrated a significant enhancement (p < 0.05) in cellular viability compared to the WT under both hypoxic and normoxic conditions (Fig. 5C and Supplementary Data 11), suggesting that the mutant variant exerts a crucial stimulatory effect on cell activity that is independent of ambient oxygen levels. Contrary to the results obtained from cell viability assays, no significant differences in cell apoptosis were observed among different genotypes of IGSF22 under normoxic conditions. However, cells expressing the IGSF22S132F demonstrated a significantly higher survival rate under hypoxic conditions compared to the WT control (IGSF22S132F: 85.77%; WT: 72.16%). Specifically, the early and late apoptosis rates for IGSF22S132F were reduced by 5.9% and 6.89%, respectively (Fig. 5D). Ultimately, we employed RT-qPCR assays to elucidate whether the IGSF22S132F genotype modulates the transcriptional response of down-stream genes. Our findings demonstrate that cell lines expressing the IGSF22S132F genotype displayed a significant increase (p < 0.01 and p < 0.05) in mRNA expression of angiogenesis-associated genes (AKT150 and ABCG251) under both normoxic and hypoxic conditions compared to WT controls. Furthermore, the mRNA expression of another angiogenesis-related gene, VEGFA52, was significantly up-regulated (p < 0.01) exclusively under hypoxic conditions (Fig. 5E; Supplementary Data 10 and 12). Hypoxia inflicts damage on biological organisms, manifesting as inflammatory responses and cardiovascular system impairments. Concurrently, the augmented expression of genes associated with angiogenesis can precipitate a vascularization response, thereby enhancing vascular density and blood flow to accommodate the organism’s oxygen requirements53,54. Our results demonstrate that the S132F genotype not only significantly enhances the expression of this immune-related gene under hypoxic conditions but also markedly elevates the mRNA expression levels of multiple angiogenesis-related genes, potentially playing a crucial role in the hypoxia adaptation of HA Vulpes species.

Fig. 5: Functional assays of the IGSF22S132F genotype.
figure 5

A Western-blotting analysis of the HEK293T cell lysates transfected with the plasmid of WT, S132F, and empty vector. B The mRNA expression levels of the WT and IGSF22S132F measured by RT-qPCR. C Effect of hypoxic stimulus on the cell viability of HEK293T cells. D Effect of hypoxic stimulus on the apoptosis of HEK293T cells. E Gene expression of the hypoxia-related genes, including AKT1, ABCG2, and VEGFA in transfected HEK293T cells.

Conclusions

In summary, we conducted the first chromosome-level genome assembly of V. v. montana using Nanopore and Hi-C technologies. We re-calibrated the divergence time of Vulpes species by combining the three other chromosome-level Vulpes genomes. This result provides a reliable reference for subsequent studies on the divergence time of Vulpes species and Canids in the context of a larger phylogenetic relationship. Additionally, an examination of the population historical dynamics of V. ferrilata and V. v. montana has revealed markedly divergent evolutionary and expansion trajectories for these two HA Vulpes species. Comparative genomics showed that two Vulpes species co-distributed on the QXP have changes in the same gene families related to HA adaptation. Strict selection pressure analysis results showed that the PSGs of V. v. montana were related to environmental factors such as high UV-radiation, low oxygen, and extreme temperatures adapted to the QXP. Functional validation experiments demonstrate that parallel amino acid site substitution events occurring within the IGSF22 gene not only enhance the gene’s expression and transcription levels under hypoxic conditions but also significantly improve the hypoxia tolerance of this immune-related gene. Furthermore, the substitution appears to promote the expression of angiogenesis-associated genes under hypoxic conditions. Consequently, the amino acid site substitution in this gene may play a positive role in the operation of the immune and angiogenic responses in HA Vulpes species, aiding in their adaptation to the immunological damage and hypoxic injury caused by hypoxia in HA environments. This study provides a new perspective to explore the HA adaptation mechanisms of closely related species, and provides reliable genetic resources for further in-depth exploration of the origin and evolution of Vulpes species.

Methods

Ethic statement, sample collection, and high-throughput sequencing

An adult male V. v. montana was collected from Golmud City, Haixi Mongolian, and Tibetan Autonomous Prefecture, Qinghai Province, China. All sample collection and experiments were approved by the Qinghai Forestry and Grassland Bureau (No. 2021-7) and conformed to the guidelines established by the Ethics Committee for the Care and Use of Laboratory Animals of the Qufu Normal University (No. 2018-013). We have complied with all relevant ethical regulations for animal use. High-quality DNA extraction was performed on muscle tissue using the QIAamp DNA blood mini kit (QIAGEN, Germany). Total RNA from four tissues (i.e., Kidney, liver, lung, and stomach) was preserved and extracted by RNAlater (Thermo Fisher Scientific, USA) and TRIzol reagents (Tiangen Biotech, China), respectively.

This study utilized both next-generation sequencing (NGS) and Long-read sequencing technologies. Genome survey, Hi-C, and transcriptome sequencing were all performed using the whole-genome shotgun strategy with Illumina HiSeq 2500 and Illumina NovaSeq 6000 sequencing platforms (Illumina, USA). The data obtained from genome survey were used for genome size estimation, repeat content detection, heterozygosity estimation, and genome assembly correction. The data from Hi-C sequencing were employed to anchor genome sequences to the chromosome level, while transcriptome sequencing was primarily used for genome structural annotation. Long reads from the PromethION sequencing platform (Oxford Nanopore Technologies, UK) (insertion length: 20 kb) were used for genome assembly and genome correction. For NGS sequencing library preparation, qualified DNA samples are randomly fragmented using a Covaris ultrasonicator (Covaris, USA). The fragmented DNA then undergoes end-repair, dA-tailing, adapter ligation, purification, and PCR amplification to construct 350 bp libraries. For long-read sequencing platforms, upon passing quality control, large DNA fragments are size-selected and recovered via the BluePippin™ Automated DNA Size Selection System (Sage Science, USA). Subsequent damage repair is performed using the SQK-LSK109 Ligation Kit (Oxford Nanopore Technologies, UK), followed by end-repair and dA-tailed adapter ligation. Finally, the DNA library is precisely quantified using a Qubit fluorometer (Thermo Fisher Scientific, USA).

Raw sequencing data generated by the NGS platform were converted from BCL to FastQ format using Bcl2fastq v1.8.4 through base calling. Subsequently, Fastp v0.20.055 was applied to filter raw data, yielding high-quality clean reads. Clean data underwent quality control via FastQC (https://github.com/s-andrews/FastQC), with qualified data retained for subsequent analyses. A total of 100,000 quality-controlled clean reads (5000 reads from read1 and 5000 reads from read2) were randomly selected and aligned against the nucleotide sequence (NT) database using BLASTn (https://blast.ncbi.nlm.nih.gov/Blast.cgi). The distribution of reads in the NT database and taxonomic classification of aligned species were statistically analyzed. Taxonomic assignments were determined based on the top six species ranked by alignment similarity. Homologous alignments were considered indicative of data accuracy without exogenous contamination.

Similar to NGS sequencing, raw data from Long-read sequencing were generated in FAST5 format and required conversion using Guppy v3.2.256 through base calling to obtain FastQ-format raw data. Subsequent filtering strategies were essentially consistent with those for NGS sequencing. Finally, Nanoplot v1.38.1 was used for data quality control.

Genome survey, assembly, and evaluation

Jellyfish v2.2.757 was used to process the quality-controlled NGS sequencing data to obtain k-mer information with varying frequencies. The steps included count (-G 2 -m 17 -C -o kmer count), histo (kmer count -o 17mer Freq), and stats (kmer count -o jelly. log). Subsequently, genome size and heterozygosity were evaluated using FindGSE v1.94.R58 and GenomeScope v2.059, respectively, to generate the final genome survey results.

NextDenovo v2.060 was used for genome assembly. Specifically, the NextCorrect module in NextDenovo was utilized to further correct the long-read clean data to obtain corrected consensus sequences (CNS), while the NextGraph module performs genome assembly based on the CNS sequences to generate a preliminary assembly. Racon v1.3.161 and NextPolish v1.0.560 were applied to correct the preliminary assembly using Nanopore and Illumina sequencing data, respectively, thereby producing the polished genome. Hi-C assembly involved the alignment of paired-end reads to the assembled genome sequence to obtain positional information based on Bowtie2 v2.3.262. Finally, LACHESIS63 was used to cluster, order, and orient the obtained contigs, resulting in chromosome-level genome sequences and statistical analyses of the chromosome scaffolding rates.

The integrity of the genome sequence was primarily evaluated using Benchmarking Universal Single-Copy Orthologs (BUSCO) and the CEGMA. BUSCO v4.0.564 was used to compare the genome with single-copy orthologs in the OrthoDB database to determine the completeness of the genome. Augustus v3.3.165 and HMMER v366 software were used for gene structure prediction and alignment to evaluate their integrity. CEGMA v267 aligned the core genes in the assembled genome with a known set of 248 conserved protein families to assess the accuracy and completeness of the assembled genome. BWA v0.7.1268, SAMtools v1.469, and BCFtools v1.8.070 were used to detect single-base variations. As for the genomic contamination assessment, Minimap271 was used to align clean reads from the Nanopore sequencing data to the assembled genome. SAMtools v1.4 calculates the sequencing depth at each genomic position based on the alignment results. Finally, a GC-depth scatter plot is generated with average GC content on the x-axis and average sequencing depth on the y-axis. Based on the principle that GC content differs across species, this analysis determines whether the genome contains foreign contamination. If no distinct GC separation is observed, it indicates the absence of exogenous contamination in the genome. If clear GC separation is detected, contaminants will be filtered in later stages using alignment results against the genome and the NT database.

Repeat annotation and gene structure prediction

Tandem Repeats (TRs) Finder v4.07b72 was used to search for TRs in the V. v. montana genome. RepeatModeler v1.0.1173, RepeatScout v1.0.674, and RECON v1.0875 were employed to identify interspersed repeats (IRs) in the genome through de novo prediction, while RepeatMasker v1.33173 detects IRs through homology-based annotation by aligning sequences against the RepBase (https://www.girinst.org/repbase/)76, its submodule RepeatProteinMask identifies transposable elements via alignment with the protein database (RepeatPeps.lib).

Gene structure prediction includes transcriptome, homology, and de novo predictions. De novo prediction was separately performed by Augustus v3.3.1, GlimmerHMM v3.0.477, and Geneid v1.4.578. Transcriptome prediction first used Fastp v0.23.1 to perform quality control on the raw sequencing data of five samples (kidney, liver, lung, muscle, and stomach) to obtain clean reads. Then the paired reads were assembled using SPAdes v3.15.379 followed by TransDecoder v5.5.080 to identify candidate coding regions in the transcript sequences, and CD-hit v4.8.181 was used to cluster the sequences to remove redundant sequences. The obtained protein sequences (together with other downloaded sequences from related species) (Table S14) were used for subsequent predictions. Spaln v2.4.682 was used to match the sequences of closely related species to the genome of V. v. montana for homology search. GeneWise v2.4.183 was used to precisely splice-matched sequences with homologous protein coding sequences. Finally, EvidenceModeler v1.1.184 was used to integrate the prediction results based on homology, RNA-Seq, and de novo prediction results to generate a non-redundant gene set. Genes with sequences shorter than 50 amino acids or with incomplete structures were filtered out.

Phylogeny, divergence time, and demographic history analyses

OrthoMCL v2.0.985 was utilized to identify gene families across species. Single-copy orthologous genes from 10 species were used to construct phylogenetic relationships, while nine species were used for subsequent selection pressure and amino acid site substitutions analysis. Specifically, to construct species-level phylogenetic relationships, this study excluded the red fox genome assembled using NGS sequencing and only retained species genomes at the chromosome level for phylogenetic relationship and divergence time analyses. Additionally, Homo sapiens and Mus musculus were included as outgroups for phylogenetic relationship analysis. Species details are provided in Supplementary Table S15. For constructing the phylogenetic tree, concatenation method based on maximum likelihood (ML) was adopted. MAFFT v7.31386 was used to perform multiple sequence alignment of the protein sequences of single-copy orthologous genes. Subsequently, PAL2NAL v1487 aligned coding sequences (CDS) based on the amino acid alignment results, and the CDS sequences were concatenated into a super matrix. Gblocks v0.91b88 was used for quality control. Finally, RAxML v8.2.1089 was employed to build the phylogenetic tree under the GTR + G (GTRGAMMA) nucleotide substitution model, with 100 bootstrap replicates to estimate node support values.

For estimating species divergence times, fourfold degenerate synonymous sites (4DTv) were first extracted from the super matrix using MEGA790. The baseml module in the PAML v4.891 was then used to calculate the nucleotide substitution rates of the 4DTv sequences. Finally, the MCMCTree module in the PAML v4.891 estimated the divergence times based on the Bayesian Markov Chain Monte Carlo algorithm. Two fossil-based calibration points and a divergence time from previous research were used as reference points for the divergence time estimation (Table S16), which were the divergence times of Caniformia and Feliformia (43–65 Mya)92,93,94, the divergence time of Canidae and Arctoidea (37–40 Mya)94,95,96, and the divergence time of Canis lupus and Vulpes (8.8–17.95 Mya)92,97.

We conducted demographic history analysis of V. v. montana and V. ferrilata using SMC + + v1.5.498. Compared to the conventional Pairwise Sequentially Markovian Coalescent analysis, SMC + + offers greater sample capacity and enhanced analytical precision. The neutral mutation rate per generation (μ) and generation time (g) for V. v. montana and V. ferrilata are 7.04 × 10−9 and 1 year99,100, respectively. Furthermore, we incorporated contemporaneous Marine Isotope Stage (MIS)101 and Pleistocene glacial epochs data as references for our analysis.

Gene family expansion and contraction

CAFÉ v4.2.1102 was used to analyze gene family expansion and contraction in V. v. montana and V. ferrilata. The gene family clustering file generated by OrthoMCL v2.0.9 served as the primary input dataset, with outlier gene families (defined as those containing ≥100 genes in any single species) being systematically excluded. An ultra-metric phylogenetic tree was concurrently employed as secondary input. The evolutionary trajectories of gene families across species were reconstructed through a birth-death model framework. Furthermore, to elucidate the functional significance of dynamically evolving gene families, GO enrichment analysis was conducted on significantly expanded or contracted gene families using the clusterProfiler package in R v3.6.2103. In this analysis, all single-copy orthologous genes from V. v. montana and V. ferrilata were utilized as the background gene set for enrichment, and only terms with statistically significant enrichment (p < 0.01 and FDR < 0.01) were retained.

Genes under positive selection

Selective pressure analysis used the Codeml module of PAML v4.891 to separately calculate the positively selected genes (PSGs) in V. v. montana and V. ferrilata at the branch-site and two-ratio branch models. Genes showing positive selection signals in both models were further screened. False positives were minimized by removing genes with positive selection signals at three or more consecutive sites, genes with positive selection sites at the beginning and end of the sequence, and genes with Bayes Empirical Bayes (BEB) posterior probabilities less than 0.9. The remaining PSGs were subjected to enrichment analysis using Metascape v3.5 (https://metascape.org)104 to explore the potential functions.

Amino acid site substitution analysis

The homologous protein sequences of the nine species were aligned to identify genes with identical amino acid site substitutions in V. v. montana and V. ferrilata (Table S15). The Codeml module in PAML v4.891 was used to reconstruct ancestral sequences of homologous genes containing at least 75% valid amino acid information. The Convergence at Conservative Sites method105 was employed to detect identical site substitutions. Genes with identical amino acid site substitutions in V. v. montana and V. ferrilata compared to those in the other seven species were retained and used for subsequent analysis. Amino acid sequences with substitution sites located in conserved structural domains were subjected to protein 3D structure prediction using I-TASSER. Finally, Chimera v1.5106 was used for protein structure comparisons to identify proteins with structural changes.

Cell maintenance, plasmid construction, and transfection

Human embryonic kidney 293 T (HEK-293T) cells from the Cell Bank of the Chinese Academy of Sciences were maintained in Dulbecco’s Modified Eagle Medium (DMEM), enriched with 1% antibiotic solution (penicillin and streptomycin) and 10% Fetal Bovine Serum. The sequence of IGSF22 was subjected to a mutation at position 132, resulting in the substitution of phenylalanine with serine (S132F). Subsequently, Site-Directed Mutagenesis Kit (D0206S, Beyotime) was utilized to introduce the serine mutation at the 132nd amino acid position, reverting the sequence to the WT.

In transfection experiments, HEK-293T cells were initially seeded into plates (07-6006, Biologix) at a density of 1 × 106 cells per well. Subsequently, plasmid DNA and transfection reagent mixtures were transfected into the cells using Lipofectamine 3000 (Invitrogen, USA) according to the manufacturer’s protocol. Cells transfected with WT, S132F, and pcDNA3.1- constructs were subjected to a 24 h incubation under conditions of 21 and 1% oxygen, respectively, to facilitate subsequent over-expression analyses, including Western blotting, quantitative polymerase chain reaction (qPCR), assessment of cellular apoptosis, and determination of cell viability.

Western blotting and qPCR quantification

Cells were lysed using RIPA lysis buffer (P0013B, Beyotime) supplemented with phenylmethanesulfonyl fluoride (PMSF; ST506, Beyotime). Total protein concentration was quantified using the BCA protein assay kit (C503021, Sangon). Western blotting analysis was performed on 10 µg of clarified cell lysates using anti-FLAG (1:8,000, E-AB-48025, Elabscience) and anti-GAPDH (1:1,000, AF2819, Beyotime). Subsequently, the membrane was incubated with a goat anti-mouse secondary antibody (1:8,000; Affinity; S0002). The antigen-antibody complexes were visualized using the ChemiDocTM XRS+ System (Bio-Rad, USA) with the SuperSignalTM West Pico PLUS chemiluminescent substrate (34577, ThermoFisher).

Cells were washed with Phosphate-Buffered Saline (PBS; C0221A, Beyotime) and digested with 0.25% Trypsin-EDTA solution (C0205, Beyotime). Total RNA was extracted using Super Fast Pure Cell RNA Isolation Kit (RC102, Vazyme) and reverse transcribed according to the manufacturer’s guidelines to synthesize complementary DNA (cDNA). The cDNA obtained was used as a template for qPCR experiments to evaluate gene expression level. Subsequently, quantitative real-time PCR analysis of ABCG2, AKT1, and VEGFA genes were performed using CFX ConnectTM Real-Time System (Bio-Rad, USA). Four replicates were set up per group, and any replicate with a Cq value exceeding the group’s average Cq value by more than 0.5 was considered an outlier and excluded. The sequences of PCR primers are detailed in the Supplementary Data 13.

Cell viability and apoptosis detection

The cytotoxicity of hypoxia on cells was evaluated using the Enhanced Cell Counting Kit-8 (C0042, Beyotime). Cells were seeded into a 96-well plate at a density of 1 × 104 cells per well with 100 μL of DMEM medium per well. Following transfection, the cells were cultured under normoxic and hypoxic conditions, respectively. After 24 h, 10 μL of the Cell Counting Kit-8 solution (CCK8, Beyotime) was added to each well and incubated at 37 °C for 1 h. Subsequently, the absorbance of each well at 450 and 600 nm was measured using a micro-plate reader (Spark, Switzerland).

The apoptotic rate was evaluated using the Annexin V-PriFluor 647/PI Apoptosis Detection Kit (P-CA-203, Procell). After washing the cells cultured under the two different conditions with cold PBS, cells were resuspended in 500 μL of Binding Buffer. Subsequently, 5 μL of Annexin V-PriFluor 647 and 5 μL of PI staining solution were added to the buffer, mixed well, and incubated in the dark at room temperature for 15 min. Finally, the apoptotic rate of cells at different time points was analyzed within 1 h using BD FACSVerseTM Cell Analyzer (BD Bioscience, USA). The specific gating strategy is detailed in Fig. S10 and S11.

Statistics and reproducibility

In GO enrichment analysis, the terms simultaneously satisfying both p < 0.01 and FDR < 0.01 were retained for subsequent analysis. In positive selection analysis, the PSGs that met p < 0.05 in both the branch-site model and the two-ratio branch model with BEB posterior probabilities exceeding 0.9 were retained for subsequent analysis. The specific information of statistical tests used in the study are provided in the respective methods sections. In the function validation section, Western blotting and qPCR quantification analysis employed three and four biological replicates, respectively, to ensure statistical accuracy. The cell viability analysis utilized a total of ten wells, with each well containing 1 × 104 cells. The cell apoptosis analysis was also performed using 1 × 104 cells.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.