Abstract
Intron gain and loss are rare events in vertebrates; however, comparative genome analysis of elephant sharks, tetrapods, and teleosts revealed a higher level of intron turnover in teleosts. slc26a1 and slc26a2 are members of the anion-exchanger gene family. Human, zebrafish, and Japanese pufferfish slc26a1 consist of two, two, and seven exons, respectively, and slc26a2, two, three, and four exons, respectively. To better understand intron turnover in teleosts, we analyzed the exon–intron organization of slc26a1 and slc26a2 in 81 vertebrates, including 62 ray-finned fish. In most Eurypterygii, which comprise the majority of the Neoteleostei and include Acanthomorpha, Aulopiformes, and Myctophiformes, slc26a1 and slc26a2 have seven and four exons, respectively, whereas those of most other ray-finned fishes consist of two and three exons, respectively, suggesting that intron gain occurred in both slc26a1 and slc26a2 of the Eurypterygii ancestor. In addition, notothenioid slc26a2 has six exons, suggesting that two introns were inserted into the notothenioid ancestor. The two newly acquired introns in the notothenioid consist of transposon-like sequences, suggesting that they were generated via transposon insertion. The positions of some of the newly acquired introns of slc26a1 and slc26a2 in Eurypterygii are identical or very close to those of other slc26 members. These results demonstrate the lineage-specific intron gains of slc26a1 and slc26a2 in ray-finned fish and convergence at the insertion sites of some of the newly acquired introns.
Introduction
Spliceosomal introns, hereafter referred to as introns, are present in eukaryotic nuclear genes and contribute to diverse gene functions. The origin of introns in eukaryotes is ancient, and many introns present in the genomes of extant species originate from relatively old eukaryotic ancestors1,2. In early eukaryotes, introns may have been present as selfish elements, but they later gained many functions independently in different eukaryotic lineages, and a wide range of functions are thought to have been inherited by modern species1. Intron sequences are transcribed to RNA and are involved in the regulation of splicing, transcription rate, nuclear export, RNA stability, and alternative splicing1,3,4,5,6, as well as sources of non-coding RNAs such as microRNAs1,7. Introns also function as DNA molecules in the nuclear genome. The intron sequences in the genome regulate gene expression via transcriptional regulation as cis-regulatory elements1,4,6, control chromatin assembly4,8, enhance the efficiency of natural selection4,9, and serve as sources of new genes4,10,11. In eukaryotes, intron-rich and intron-poor species are interspersed, and there are conserved intron positions between widely diverged species2,12,13. These observations suggest that ancestral eukaryotes are intron-rich and that lineage-specific losses affect the exon–intron structure of genes in extant species2. Relatively new intron gains and losses are specific to each lineage. Mechanisms of intron loss include reverse transcriptase-mediated intron loss2,14,15 and genomic deletion2. The mechanisms of intron gain include intron transposition, transposon insertion, tandem genomic duplication, intron gain during double-strand break repair, insertion of a group II intron, intron transfer, and intronization2,16.
Several studies comparing eukaryotic genes have shown that specific intron locations (10–40%) are conserved among eukaryotes and that the number and placement of most introns are dynamic during evolution17. However, in vertebrates, analyses of the whole-genome sequences of vertebrate species have shown that intron turnover is low18. Genomic comparisons between the Japanese pufferfish and spotted green pufferfish indicated very low levels of intron turnover in these lineages19. Subsequent analysis using cartilaginous fish as an outgroup confirmed that intron turnover was low when comparing cartilaginous and tetrapod genome sequences, but comparison of cartilaginous and teleost genome sequences revealed high intron turnover in the teleost genome18,20. High intron turnover can potentially contribute to the phenotypic diversity of teleosts; however, it remains unclear why teleosts have a high intron turnover rate and whether high intron turnover contributes to phenotypic diversity18.
Vertebrate body fluids contain major inorganic anions such as Cl−, HCO3−, phosphate, and sulfate. Anion homeostasis in body fluids is maintained by anion channels and anion transporter families. The solute carrier 26 (Slc26) family is one of the anion transporter families and comprises 11 members, Slc26a1-a11, in mammals21,22 (Note that, in this article, protein name abbreviations of all species are shown with the first letter capitalized, and gene names of all species are shown as lowercase and italicized). Recently, we identified a novel member of Slc26, Slc26a12, which is widely present in coelacanths, amphibians, various reptiles, and birds but not in cartilaginous fishes, ray-finned fishes, most turtles, some lineages of birds, and mammals23. In species that possess the Slc26a12 gene (slc26a12), slc26a12 and slc26a2 are always tandemly present at the same locus on the same chromosome. Since slc26a12 is present in coelacanths, tetrapods that lack it may have secondarily lost slc26a12. Cartilaginous and ray-finned fishes lack slc26a12, whereas a jawless fish, the inshore hagfish, possesses a gene similar to slc26a12. Therefore, it is possible that cartilaginous and ray-finned fishes also secondarily lost slc26a12, but no clear evidence has been provided23. Physiologically, members of the Slc26 family are involved in sulfate transport24,25, bicarbonate secretion and Cl− absorption by the digestive tract26,27,28, Cl− reabsorption in the kidney29,30, Cl− secretion in the stomach29,30, oxalate efflux31, and auditory organ function32,33,34. Slc26 proteins share 12 transmembrane regions and a sulfate transporter and anti-sigma factor antagonist (STAS) domain in the intracellular carboxy-terminal region21,22.
Slc26a1, also known as sulfate anion transporter 1 (Sat-1), is an Na+-independent sulfate transporter found on the basolateral membrane of intestinal and renal epithelial cells and hepatocytes that transports sulfate between body fluids and the cytoplasm in mammals35. Study of a patient presenting with painful perichondritis, hyposulfatemia, and renal sulfate wasting revealed a mutation in the human slc26a1 gene25. Slc26a2 was first isolated by positional cloning of the gene responsible for diastrophic dysplasia and is also called diastrophic dysplasia sulfate transporter (Dtdst)24. Various slc26a2 mutations have been found in chondrodysplasia syndromes, and the analysis of mice expressing slc26a2 mutants has revealed skeletal abnormalities, decreased chondrocyte proliferative activity, and decreased sulfate absorption into chondrocytes21,36. Slc26a1 has also been isolated from rainbow trout, Japanese eel, and elephant sharks37,38,39. It is localized in the basolateral membrane of proximal tubules and exhibits sulfate transport activity when heterologously expressed in Xenopus laevis oocytes. In the Japanese eel, a euryhaline species, the kidney functions to retain sulfate during freshwater acclimation and to excrete sulfate during seawater acclimation. Slc26a1 is thought to contribute to both freshwater and seawater acclimation by increasing the sulfate permeability of the basolateral membrane of the proximal tubule39,40,41. When slc26a2 was knocked down in zebrafish, significant defects were observed in otolith patterns, semicircular canal morphology, and lateral neuromast distribution in morphants, indicating that this gene is important for auditory development. Expression of slc26a2 has also been observed in the proximal tubules of the pronephric duct in zebrafish embryos42.
The exon–intron structure of the slc26 genes is conserved within this subfamily. The protein-coding regions of the mammalian slc26a1 and slc26a2 genes are encoded by two exons24,43,44,45,46, and the tetrapod slc26a12 genes are encoded by two exons23. In contrast, the coding regions of the other slc26 genes consist of approximately 20 exons47,48,49,50. In our previous study on the Slc26 family in pufferfish and zebrafish, we found diversity in the number of exons in the slc26a1 and slc26a2 genes. slc26a1 has two exons in zebrafish and seven exons in Japanese pufferfish, whereas slc26a2 has three exons in zebrafish and four exons in Japanese pufferfish. Therefore, slc26a1 and slc26a2 are good examples for studying intron turnover in vertebrates. In the present study, we analyzed the exon–intron structure of these genes using the genome databases of 81 vertebrate species, including 62 ray-finned fish. The results suggested that intron insertions occurred in both slc26a1 and slc26a2 in the ancestor of Eurypterygii. Further analysis revealed that slc26a2 in notothenioids has a six-exon structure because of the relatively recent acquisition of two introns and that the newly acquired introns are homologous to transposon-like sequences. These results provide a useful example for understanding high intron turnover in teleosts.
Methods
Identification of orthologs for slc26a1, slc26a2, and slc26a12
First, we collected amino acid and cDNA sequences of Slc26a1 and Slc26a2 from humans, western clawed frogs, zebrafish, Japanese pufferfish, three-spined sticklebacks, and Japanese medaka, and of Slc26a12 from western clawed frogs. Using these sequences as queries, BLASTp and tBLASTn analyses were performed against the protein and genome databases of the species listed in Table 1 in NCBI (https://blast.ncbi.nlm.nih.gov)51 and ENSEMBL (https://www.ensembl.org)52 to collect the sequences expected to be slc26a1, slc26a2, and slc26a12. Some of these genes were manually annotated and registered as third-party annotations (TPA) in the DDBJ (BR002474–BR002489 and BR002490–BR002502). The collected amino acid sequences for Slc26a1, Slc26a2, and Slc26a12 were aligned using ClustalW software (https://www.genome.jp/tools-bin/clustalw)53 and a phylogenetic tree was constructed to confirm that the nomenclature was correct. Because evidence for a direct orthologous relationship between these genes and the related genes in jawless vertebrates was weak, we designated the latter as Slc26a2-like and Slc26a12-like.
The evolutionary history was inferred by the Maximum Likelihood method and the Le and Gascuel (LG) model54 using IQ-TREE55 (https://www.hiv.lanl.gov/content/sequence/IQTREE/iqtree.html). The percentages of trees in which the associated taxa were clustered together were generated from 1000 ultrafast bootstrap replicates56. The alignment had 167 sequences with 861 columns, 815 distinct patterns, 695 parsimony-informative sites, 81 singleton sites, and 85 constant sites. The resulting Newick format tree was visualized using MEGA1157.
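The site categories reported for the alignment can be illustrated with a minimal sketch. It assumes a simplified rule set (gaps and unknown residues are ignored, and a column is parsimony-informative when at least two residues each occur in two or more sequences); IQ-TREE's own handling of gaps and ambiguity codes may differ. The function names and the toy alignment are hypothetical and are not taken from the study.

```python
from collections import Counter

def classify_column(column: str) -> str:
    """Classify one alignment column, ignoring gaps and unknown residues."""
    counts = Counter(c for c in column.upper() if c not in "-?X")
    if len(counts) <= 1:
        return "constant"
    # Parsimony-informative: two or more residues, each present in >= 2 sequences.
    if sum(1 for n in counts.values() if n >= 2) >= 2:
        return "parsimony-informative"
    # Variable, but the variation is confined to single sequences.
    return "singleton"

def summarize(alignment: list[str]) -> Counter:
    """Count site categories over all columns of an equal-length alignment."""
    columns = ("".join(col) for col in zip(*alignment))
    return Counter(classify_column(col) for col in columns)

# Toy alignment (invented): four sequences, four columns.
toy = ["MKLA", "MKLA", "MRLS", "MRIA"]
summary = summarize(toy)

# The three categories partition the columns, as in the reported alignment:
# 695 parsimony-informative + 81 singleton + 85 constant = 861 columns.
reported_total = 695 + 81 + 85
```

Because every column falls into exactly one category, the three reported counts should sum to the number of alignment columns, which they do for the values given above.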
Estimation of exon–intron organization of slc26a1, slc26a2, and slc26a12
Using the annotated amino acid sequences of Slc26a1, Slc26a2, and Slc26a12 and their coding regions in the cDNA sequences as queries, we performed tBLASTn and BLASTn analyses against the genome databases of the species listed in Table 1 in NCBI (https://blast.ncbi.nlm.nih.gov)58 and ENSEMBL (https://www.ensembl.org) to confirm the exons encoding the proteins and the introns that separate the exons (Supplementary Fig. S1). Introns were predicted according to the gt-ag rule, and the exon–intron organization of each gene was confirmed. For species whose genome analysis results are shown in the NCBI Genome Data Viewer, we referred to the RNA-seq exon coverage (aggregate, filtered) results to visually confirm whether the prediction of the exon–intron region was reasonable59 (Supplementary Figs. S2, S3). Some of the annotated amino acid sequences contained amino-terminal extensions that are not well conserved in other species. Exons encoding species-specific amino-terminal sequences were excluded from the analysis in this study, and only exons encoding amino acid sequences that are conserved among species were included in the analysis.
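As an illustration of the gt-ag check, the following minimal sketch tests whether a candidate intron begins with gt and ends with ag, and then splices it out of a genomic fragment. The sequences, coordinates, and function names are invented for the example and do not come from the genes analyzed here.

```python
def follows_gt_ag(intron: str) -> bool:
    """Return True if the intron starts with GT and ends with AG."""
    s = intron.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")

def splice(genomic: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as 0-based, end-exclusive (start, end) pairs."""
    pieces = []
    previous_end = 0
    for start, end in sorted(introns):
        pieces.append(genomic[previous_end:start])
        previous_end = end
    pieces.append(genomic[previous_end:])
    return "".join(pieces)

# Toy gene (invented): exons in upper case, intron in lower case.
exon1, intron, exon2 = "ATGGCC", "gtaagtttttcag", "GGTTAA"
genomic = exon1 + intron + exon2
intron_span = (len(exon1), len(exon1) + len(intron))

is_canonical = follows_gt_ag(genomic[intron_span[0]:intron_span[1]])
coding = splice(genomic, [intron_span])
```

In practice, a boundary predicted this way would still be checked against sequence conservation and RNA-seq coverage, as described above, because gt and ag dinucleotides occur frequently by chance.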
Comparison of the sites of intron insertion in slc26a1, slc26a2, and slc26a12
The amino acid sequences of Slc26a1 (human, European smelt, peladilla, and Atlantic cod), Slc26a2 (human, zebrafish, European smelt, Atlantic cod, big-belly seahorse, and emerald rockcod), Slc26a12 (chicken), Slc26a2-like (inshore hagfish and sea lamprey), Slc26a12-like (sea lamprey), and Slc26a3 (human) were aligned using ClustalW software. The intron insertion sites of each protein were marked on the alignment. Introns were classified according to whether they were inserted before the first, second, or third base of a codon and were labeled as 0, 1, and 2, respectively. A schematic diagram was created from the alignment obtained (Supplementary Fig. S4).
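The 0, 1, and 2 labels correspond to the number of coding nucleotides preceding the intron, modulo 3. A minimal sketch of this calculation is given below; the exon lengths are hypothetical and the function name is ours.

```python
def intron_phases(coding_exon_lengths: list[int]) -> list[int]:
    """Return the phase (0, 1, or 2) of each intron between coding exons.

    Phase 0: intron lies before the first base of a codon.
    Phase 1: intron lies before the second base of a codon.
    Phase 2: intron lies before the third base of a codon.
    """
    phases = []
    coding_so_far = 0
    # The last exon is not followed by an intron, so it is skipped.
    for length in coding_exon_lengths[:-1]:
        coding_so_far += length
        phases.append(coding_so_far % 3)
    return phases

# Hypothetical four-exon coding region (lengths in nucleotides, total 720).
example_lengths = [300, 121, 199, 100]
example_phases = intron_phases(example_lengths)
```

For these invented lengths, the three introns follow 300, 421, and 620 coding nucleotides and are therefore assigned phases 0, 1, and 2. Two introns are considered to occupy the same site only when both the aligned codon position and the phase agree.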
Comparison of sequences between introns and putative transposable elements
Using the sequences of the newly acquired introns 3 and 5 in the Notothenia slc26a2 as queries, BLASTn analysis was performed against the Notothenia genome database in NCBI (https://blast.ncbi.nlm.nih.gov)58 to identify and classify regions that are homologous to other loci. Sequences homologous to multiple loci were designated notothenioid putative transposable elements (NTEs).
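The criterion of homology to multiple loci can be sketched as a count of distinct genomic regions hit by each query in BLAST tabular output (the standard 12-column `-outfmt 6` layout). The identity, length, and merge-distance thresholds below are assumptions chosen for illustration, not values used in this study, and the scaffold names and hits are invented.

```python
def count_loci(blast_tab: str, min_identity: float = 80.0,
               min_length: int = 50, merge_distance: int = 1000) -> dict[str, int]:
    """Count distinct subject loci hit by each query in BLAST outfmt 6 text."""
    hits: dict[str, list[tuple[str, int, int]]] = {}
    for line in blast_tab.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        qseqid, sseqid = f[0], f[1]
        pident, length = float(f[2]), int(f[3])
        sstart, send = int(f[8]), int(f[9])
        if pident < min_identity or length < min_length:
            continue
        # Minus-strand hits have sstart > send, so normalize the interval.
        hits.setdefault(qseqid, []).append(
            (sseqid, min(sstart, send), max(sstart, send)))

    loci: dict[str, int] = {}
    for qseqid, intervals in hits.items():
        intervals.sort()
        n, last_seq, last_end = 0, None, 0
        for sseqid, start, end in intervals:
            if sseqid != last_seq or start - last_end > merge_distance:
                n += 1  # a new locus begins
                last_seq, last_end = sseqid, end
            else:
                last_end = max(last_end, end)  # same locus, extend it
        loci[qseqid] = n
    return loci

# Invented hits: two nearby hits on scaffold_A, one each on B and C,
# and a low-identity hit on scaffold_D that is filtered out.
rows = [
    ("intron3", "scaffold_A", "95.0", "300", "10", "2", "1", "300", "100", "400", "1e-50", "400"),
    ("intron3", "scaffold_A", "92.0", "150", "8", "1", "301", "450", "450", "600", "1e-30", "200"),
    ("intron3", "scaffold_B", "90.0", "300", "20", "3", "1", "300", "5000", "5300", "1e-40", "350"),
    ("intron3", "scaffold_C", "88.0", "300", "25", "3", "1", "300", "900", "600", "1e-35", "300"),
    ("intron3", "scaffold_D", "70.0", "300", "80", "5", "1", "300", "10", "310", "1e-5", "80"),
]
toy_table = "\n".join("\t".join(r) for r in rows)
locus_counts = count_loci(toy_table)
multi_locus_queries = [q for q, n in locus_counts.items() if n >= 2]
```

A query segment that matches two or more separate loci under such a filter would be a candidate for manual inspection as a repeated, transposon-like element.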
Synteny analyses
Representative slc26a1, slc26a2, and slc26a12 genes listed in Table 1 were selected for synteny analysis. Information on the neighboring genes of each gene was collected using the NCBI Genome Data Viewer (https://www.ncbi.nlm.nih.gov/genome/gdv/)51 and ENSEMBL (https://www.ensembl.org)52, and the order and orientation of each gene were summarized in a figure.
Analyses of the ohnologs generated by the teleost-specific whole-genome duplication in ancestral teleosts
All teleosts examined harbored one slc26a1 and one slc26a2. In other words, we could not find any species that harbored both ohnologs derived from teleost-specific whole-genome duplications. To confirm whether teleost slc26a1 and slc26a2 obtained in this study were derived from one of the ohnologs generated by the teleost-specific whole-genome duplication or from both ohnologs, we performed a series of analyses. We selected genes that existed in loci close to slc26a1 and slc26a2 and that conserved an ohnolog in another locus. The amino acid sequences of these genes were collected, and phylogenetic trees were generated for each gene using ClustalW and IQ-TREE, as described above. The alignment had 11 sequences with 1314 columns, 372 distinct patterns, 182 parsimony-informative, 136 singleton sites, and 996 constant sites. We then assessed whether genes in loci close to slc26a1 and slc26a2 in the phylogenetic tree were clustered in one branch or distributed in two branches and determined the ohnologous relationship between extant slc26a1 and slc26a2 in teleosts.
Results
Exon–intron structures of slc26a1, slc26a2, and slc26a12 in tetrapods, lobe-finned fishes, cartilaginous fishes, and jawless fishes
Before describing the results for ray-finned fish, the exon–intron structures of slc26a1, slc26a2, and slc26a12 in tetrapods, lobe-finned fish, cartilaginous fishes, and jawless fishes were analyzed and compared (Figs. 1, 2). The validity of the exon–intron structure prediction was confirmed by comparing the sequence conservation using tBLASTn (Supplementary Fig. S1) and the RNA-seq exon coverage (aggregate, filtered) results shown in the NCBI Genome Data Viewer (Supplementary Figs. S2, S3)59. The correct name of each gene was confirmed by molecular phylogenetic analysis (Fig. 3). The species used in this study are listed in Table 1. Among tetrapods, mammals have slc26a1 and slc26a2, whereas birds, reptiles, and amphibians have slc26a1, slc26a2, and slc26a1223. In the examined placental mammals, marsupials, monotremes, birds, reptiles, and amphibians, the protein-coding regions of slc26a1 and slc26a2 consisted of two exons, and the positions of these introns were conserved (Figs. 1, 2). In amphibians, reptiles, and birds, the protein-coding region of slc26a12 has two exons, and the positions of the introns were conserved with those of tetrapod slc26a1 and slc26a2 (Figs. 1 and 2).
Exon–intron organization of slc26a1, slc26a2, and slc26a12 in vertebrates. Results for 28 species are shown. Exons are indicated by filled-in colored boxes and numbers, and introns are indicated by white vertical bars (right). Divergence times of species were retrieved from the TimeTree database (http://www.timetree.org/)62 and shown on the left. The accession number of each sequence is summarized in Table 1.
Comparison of intron positions among jawed vertebrate slc26a1, slc26a2, and slc26a12, related genes of lamprey and hagfish, and human slc26a3. (A) Schematic representation of the domain structure of human Slc26a1 protein. Transmembrane domains and the STAS (Sulfate Transporter and Anti-Sigma factor antagonist) domain are indicated by black and gray boxes, respectively. (B) Position of intron insertion sites in comparison with the Slc26a1 domain structure shown in (A). Horizontal bars indicate the polypeptide of each protein. Boxes indicate the sites of intron insertion. The numbers indicate the position of intron insertion within each codon.
Phylogenetic analysis of Slc26a1, Slc26a2, and Slc26a12 in vertebrates. The amino acid sequences of Slc26a1, Slc26a2, and Slc26a12 in jawed vertebrates were aligned with Slc26a2-like and Slc26a12-like in jawless fishes using ClustalW software and a phylogenetic tree was constructed by the maximum-likelihood method using IQ-TREE55. Numbers indicate bootstrap values. The accession numbers of the amino-acid sequences used in this study are listed in Table 1.
Lobe-finned fish, such as coelacanths, have slc26a1, slc26a2, and slc26a12, all of which are encoded by two protein-coding exons, and the positions of the introns were conserved with those of the tetrapod orthologs (Fig. 1). Cartilaginous fish have both slc26a1 and slc26a2. In the holocephalans, sharks, and rays examined, the protein-coding regions of slc26a1 and slc26a2 contained two exons, and the positions of the introns were conserved with those of the orthologs in tetrapods and lobe-finned fish (Fig. 1).
The results for jawless fish differed from those for the vertebrate species described above. Hagfish have slc26a2-like and slc26a12-like genes, whereas lamprey has slc26a2-like23. In hagfish and lamprey, the protein-coding region of slc26a2-like consists of six exons (Fig. 1). Four of the five introns in slc26a2-like were conserved between hagfish and lamprey, whereas intron 2 of hagfish slc26a2-like and intron 4 of lamprey slc26a2-like were present at unique positions in each gene (Figs. 1 and 2). In hagfish, slc26a12-like is encoded by four exons, and the sites of two of the three introns were conserved with those of slc26a2-like in hagfish and lamprey (Figs. 1 and 2). The positions of intron 4 of hagfish slc26a2-like, intron 3 of lamprey slc26a2-like, and intron 1 of hagfish slc26a12-like were conserved with those of intron 1 of slc26a1, slc26a2, and slc26a12 in tetrapods, lobe-finned fish, and cartilaginous fish (Fig. 2).
Exon–intron structures of slc26a1 and slc26a2 in basal ray-finned fishes and teleosts in Eloposteoglossocephala, Ostariophysi, Argentiniformes, Esociformes, and Salmoniformes
Since ray-finned fish have slc26a1 and slc26a2 but not slc26a12, we analyzed the exon–intron structures of the protein-coding regions of these two genes in ray-finned fish species. Here, the results for basal ray-finned fishes and teleosts other than Eurypterygii, Galaxiiformes, Osmeriformes, and Stomiiformes are described.
In basal ray-finned fishes, such as polypterids, sturgeonids, and gariforms, the protein-coding regions of slc26a1 and slc26a2 had two and three exons, respectively (Fig. 1). The positions of intron 1 of slc26a1 and slc26a2 in these species were conserved with those of intron 1 of slc26a1 and slc26a2 in cartilaginous fish, lobe-finned fish, and tetrapods. Intron 2 of slc26a2 in basal ray-finned fish was present at a unique position and was conserved between slc26a2 of basal ray-finned fish.
In 15 teleost species in 13 orders/suborders/families (Eloposteoglossocephala including Osteoglossiformes, Elopiformes, Albuliformes, Anguilliformes; Ostariophysi including Clupeiformes, Gonorynchiformes, Cypriniformes, Gymnotiformes, Characiformes, Siluriformes; and Argentiniformes, Esociformes, and Salmoniformes), slc26a1 and slc26a2 had two and three exons, respectively, and their intron positions were conserved with those of the basal ray-finned fish (Fig. 4). These results suggest that the protein-coding regions of slc26a1 and slc26a2 in the common ancestral species of ray-finned fish had two and three exons, respectively, and that slc26a2 acquired intron 2 in the common ancestor of ray-finned fish.
Exon–intron organization of slc26a1 and slc26a2 in spotted gar and teleosts. Results from 59 teleost species from 47 orders/suborders/families are presented and compared with those of the spotted gar, a basal ray-finned fish that is not a teleost. Exons are indicated by filled colored boxes and numbered, and introns are indicated by white vertical bars (right). Divergence times of species were retrieved from the TimeTree database (http://www.timetree.org/)62 and shown on the left. The tree topology between Osteoglossiformes (Asian bonytongue), Elopiformes (tarpon), Albuliformes (West African bonefish), and Anguilliformes (European eel and European conger) was drawn based on the recent study by Parey et al.90. The tree topology between Galaxiiformes (peladilla) and Eurypterygii was drawn based on the recent study by Lavoué et al.60 and Near et al.61. The accession number of each sequence is summarized in Table 1.
Intron gain of slc26a1 in Eurypterygii and Galaxiiformes
The preliminary analysis showed that the protein-coding region of slc26a1 had seven exons in species such as cod, medaka, tilapia, stickleback, and pufferfish. Therefore, we performed a detailed analysis of slc26a1 in 39 species of 30 orders/suborders/families in Eurypterygii (Aulopiformes; Myctophiformes; and Acanthomorpha including Gadiformes, Lampriformes, Holocentriformes, Scombriformes, Callionymiformes, Syngnathiformes, Gobiiformes, Ambassidae, Pomacentridae, Blenniiformes, Cichliformes, Atheriniformes, Beloniformes, Aplocheiloidei, Cyprinodontoidei, Anabantiformes, Synbranchiformes, Carangiformes, Centropomidae, Pleuronectiformes, Labriformes, Centrarchiformes, Moronidae, Sciaenidae, Spariformes, Tetraodontiformes, Gasterosteiformes, and Notothenioidei). The protein-coding region of slc26a1 in these 39 species contains seven exons (Fig. 4). The position of intron 2 of slc26a1 in Eurypterygii species was conserved with that of intron 1 of slc26a1 in cartilaginous fish, tetrapods, lobe-finned fish, basal ray-finned fish, and teleost species other than Eurypterygii, as described in the sections above (Figs. 1, 2, and 4). The positions of introns 1, 3, 4, 5, and 6 of slc26a1 were unique and conserved among the Eurypterygii (Figs. 2 and 4).
The protein-coding region of slc26a1 in peladilla (Aplochiton taeniatus, Galaxiiformes), which is relatively close to Eurypterygii based on the evolutionary analyses by Lavoué et al.60 and Near et al.61, consists of four exons (Figs. 2 and 4). The positions of introns 1, 2, and 3 of peladilla slc26a1 were conserved with those of introns 1, 2, and 5 of Eurypterygii slc26a1, respectively (Figs. 2 and 4). These results suggest that intron gain occurred in the common ancestor of Eurypterygii and Galaxiiformes and that subsequent intron turnover, such as intron gain or loss, may have occurred in either or both of the ancestral species of Eurypterygii and Galaxiiformes. The estimated divergence date between the lineages is 139–169 million years ago (Mya)62.
Intron gain of slc26a1 in Osmeriformes and Stomiiformes
The protein-coding regions of slc26a1 in three species belonging to Stomiiformes and Osmeriformes, which are relatively close to Eurypterygii and peladilla, consist of three exons. The position of intron 1 of slc26a1 in these species was conserved with that of intron 1 of slc26a1 in basal ray-finned fishes and teleosts other than Eurypterygii, and intron 2 of slc26a1 in Eurypterygii. In contrast, intron 2 of these three species was inserted at 23 bp from the position of intron 3 in slc26a1 of Eurypterygii (Fig. 2).
Intron gain of slc26a2 in Eurypterygii, Galaxiiformes, Osmeriformes, and Stomiiformes
As the preliminary analysis showed that the protein-coding region of slc26a2 consists of four exons in species such as cod, medaka, stickleback, and pufferfish, we performed a detailed analysis of slc26a2 in 39 species of 30 orders/suborders/families in Eurypterygii. In species other than Syngnathiformes and Notothenioidei, the protein-coding regions of slc26a2 consisted of four exons (Fig. 4). The positions of introns 1 and 3 of slc26a2 in Eurypterygii, other than Syngnathiformes and Notothenioidei, were conserved with those of introns 1 and 2 of slc26a2 in basal ray-finned fish and most teleosts other than Eurypterygii (Figs. 2 and 4). The protein-coding region of slc26a2 of the peladilla (Galaxiiformes), which is relatively close to Eurypterygii based on the evolutionary analyses by Lavoué et al.60 and Near et al.61, also had a four-exon structure similar to that of Eurypterygii (Fig. 4), suggesting that intron 2 was acquired in the common ancestor of Eurypterygii and Galaxiiformes.
Three species belonging to Stomiiformes and Osmeriformes also contained four exons in the protein-coding region of slc26a2. However, intron 2 of these three species was positioned 7 bp away from intron 2 of slc26a2 in Eurypterygii (Fig. 2, Supplementary Fig. S4).
In some species belonging to Syngnathiformes and Notothenioidei, the protein-coding regions of slc26a2 consisted of three and six exons, respectively (Fig. 4). Since Syngnathiformes and Notothenioidei belong to Eurypterygii, it was hypothesized that slc26a2 in these lineages underwent intron loss and gain, respectively, as described below.
Intron loss of slc26a2 in seahorses and some pipefishes
Analysis of slc26a2 in the genome databases of Syngnathiformes (pipefishes and seahorses) showed that the slc26a2 of the straightnose pipefish (Nerophis ophidion) consisted of four exons, like that of most other Eurypterygii, but that of three other species, namely the broad-nosed pipefish (Syngnathus typhle), common seadragon (Phyllopteryx taeniolatus), and big-belly seahorse (Hippocampus abdominalis), consisted of three exons (Figs. 4 and 5A). The positions of the two introns of slc26a2 in these species were conserved with those of introns 2 and 3 of the slc26a2 in straightnose pipefish and most other Eurypterygii, suggesting that an intron was lost in the common ancestor of broad-nosed pipefish, common seadragon, and big-belly seahorse (Fig. 5A).
Timing of recent intron turnovers of slc26a2 in Syngnathiformes and Notothenioidei and two scenarios for the intron turnovers of slc26a1 and slc26a2 in Euteleostei. (A) Timing of recent intron loss of slc26a2 in Syngnathiformes. (B) Timing of recent intron gain of slc26a2 in Notothenioidei. In (A) and (B), arrows indicate the timing of recent intron turnovers of slc26a2 in each lineage. Divergence times of species were retrieved from the TimeTree database (http://www.timetree.org/)62 and shown on the left. (C) Two scenarios for the intron turnovers of slc26a1 and slc26a2 in Euteleostei. The upper panel indicates a scenario for the intron turnovers based on the evolutionary analyses by Lavoué et al.60 and Near et al.61, and the lower panel indicates that based on the study by Betancur-R et al.65.
Intron gain of slc26a2 in notothenioids
Seven species belonging to the suborder Notothenioidei were analyzed. According to Bista et al.63, these seven species can be classified into six groups. The phylogenetic relationships of the Notothenioidei species are shown in Fig. 5B. The slc26a2 of five notothenioid species (emerald rockcod Trematomus bernacchii and marbled rockcod Notothenia rossii in Nototheniidae, blackfin icefish Chaenocephalus aceratus in Channichthyidae, Antarctic spiny plunderfish Harpagifer antarcticus in Harpagiferidae, and white-fin plunderfish Pogonophryne albipinna in Artedidraconidae) had a common six-exon structure, whereas slc26a2 of the other two notothenioid species (Tristan klipfish Bovichtus diacanthus in Bovichtidae and Patagonian blennie Eleginops maclovinus in Eleginopidae) had a four-exon structure similar to that of non-notothenioid species in Eurypterygii (Figs. 4 and 5B). The positions of introns 1, 2, and 4 of slc26a2 in the five notothenioids were conserved with those of introns 1, 2, and 3 of slc26a2 in the other Eurypterygii, respectively (Fig. 2). This result suggests that introns 3 and 5 of slc26a2 in the five notothenioids were newly acquired in their common ancestor (Fig. 5B). The estimated divergence date between these five notothenioids and the Patagonian blennie was 17–38 Mya64 (Fig. 5B).
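The inference that introns 3 and 5 are new follows from matching intron positions between the six-exon and four-exon genes. A minimal sketch of this comparison is shown below. Each intron is represented as an (aligned codon position, phase) pair; the coordinates are invented for illustration and are not the positions shown in Fig. 2, and the function name is ours.

```python
Intron = tuple[int, int]  # (aligned codon position, phase)

def compare_introns(reference: list[Intron], query: list[Intron]) -> dict[str, list]:
    """Match introns of a query gene to a reference gene by position and phase.

    Intron numbers in the result are 1-based, in 5' to 3' order.
    """
    ref_number = {site: i + 1 for i, site in enumerate(reference)}
    query_sites = set(query)
    return {
        # (query intron number, reference intron number) for shared sites
        "shared": [(i + 1, ref_number[s]) for i, s in enumerate(query) if s in ref_number],
        # introns present only in the query: candidate gains (or reference losses)
        "query_only": [i + 1 for i, s in enumerate(query) if s not in ref_number],
        # introns present only in the reference: candidate losses (or reference gains)
        "reference_only": [i + 1 for i, s in enumerate(reference) if s not in query_sites],
    }

# Hypothetical coordinates for a four-exon gene and a six-exon gene.
four_exon_gene = [(120, 0), (250, 1), (400, 2)]
six_exon_gene = [(120, 0), (250, 1), (310, 0), (400, 2), (520, 1)]
result = compare_introns(four_exon_gene, six_exon_gene)
```

With these invented coordinates, query introns 1, 2, and 4 match reference introns 1, 2, and 3, and query introns 3 and 5 have no counterpart, which mirrors the pattern described above. Whether an unmatched intron reflects a gain or a loss still depends on the outgroup comparison, here the four-exon structure of the Tristan klipfish and Patagonian blennie.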
Alternative scenarios for the intron turnovers of slc26a1 and slc26a2 in Euteleostei
Euteleostei is a group of ray-finned fish that includes Neoteleostei (including Eurypterygii), Esociformes, Salmoniformes, Argentiniformes, Stomiiformes, Osmeriformes, and Galaxiiformes, among others60,61,65. There are several hypotheses regarding the evolutionary history of Euteleostei. As mentioned above and shown in Fig. 4, Lavoué et al.60 and Near et al.61 showed that the above species in Euteleostei consist of three clades: 1) Esociformes, Salmoniformes, and Argentiniformes; 2) Stomiiformes and Osmeriformes; and 3) Galaxiiformes and Neoteleostei (including Eurypterygii). In contrast, Betancur-R et al.65 showed that the above species in Euteleostei consist of three clades: 1) Esociformes, Salmoniformes, Argentiniformes, and Galaxiiformes; 2) Stomiiformes and Osmeriformes; and 3) Neoteleostei (including Eurypterygii). The scenarios for the intron turnover of slc26a1 and slc26a2 in Euteleostei based on these two hypotheses are summarized in Fig. 5C. slc26a1 and slc26a2 have new introns conserved between Galaxiiformes and Eurypterygii. Based on the evolutionary tree by Lavoué et al.60 and Near et al.61, which places Galaxiiformes close to Neoteleostei, the intron turnover scenario is simple because the similarity of the exon–intron structures of slc26a1 and slc26a2 agrees well with the clade composition of the species (Fig. 5C, upper panel). However, based on the evolutionary tree of Betancur-R et al.65, which places Galaxiiformes close to Esociformes, Salmoniformes, and Argentiniformes, the intron turnover scenario is more complex because of the discrepancy between the exon–intron structures of slc26a1 and slc26a2 and the clade organization of the species (Fig. 5C, lower panel).
Origin of newly inserted introns of slc26a2 in notothenioids
Introns 3 and 5 of slc26a2 in the five notothenioids were assumed to have been acquired relatively recently, as aforementioned and shown in Fig. 5B, indicating that little time has passed since intron insertion, and that they may remain in sequences similar to those at the time of intron insertion. Introns 3 and 5 of slc26a2 in the five notothenioids showed high sequence similarity, indicating that they were homologous. In contrast, the introns of slc26a1 and slc26a2 acquired in the common Eurypterygii ancestor, that is, introns 1, 3, 4, 5, and 6 of slc26a1 and intron 3 of slc26a2 in most Eurypterygii species, showed little sequence homology among species, probably because of the accumulation of many neutral substitutions during the approximately 139–169 million years since their divergence after intron insertion. Therefore, the intron 3 and 5 sequences of slc26a2 in the five notothenioid species were considered useful for analyzing the mechanism of intron gain.
A BLAST analysis of the whole genome sequences of Nototenia species using intron sequences as queries revealed that sequences similar to parts of the intron were interspersed throughout the genome, and suggest that these introns consisted of putative transposon sequences (Fig. 6). We tentatively refer to these sequences as notothenioid putative transposable elements (NTEs). The 5′ and 3′ end of the new intron 3 of slc26a2 in the five notothenioids were highly homologous to NTE-1, and the central region of intron 3 was homologous to NTE-2 (Fig. 6B). NTE-2 contained a 37-nt repeat at the center.
Schematic representation of the primary structure of recently acquired introns in notothenioid slc26a2 and transposable element-like sequences. (A) Length and exon–intron organization of slc26a2 in a notothenioid, the emerald rockcod. Exons and introns are indicated by black boxes and horizontal bars, respectively. (B) Schematic representation of the sequence of the newly acquired intron 3 in notothenioids. (C) Schematic representation of the sequence of the newly acquired intron 5 in notothenioids. NTE, notothenioid putative transposable element. (D–K) Insertions of the notothenioid putative transposable elements NTE-1s into multiple loci of the notothenioid genomes. Insertions of NTE-1 into slc26a2 (D) and other loci (E–K) are shown. Accession numbers and the regions of the indicated sequences are listed at the beginning of each line. Double slashes indicate shortened sequences. Gaps are indicated by dashes in the sequences. Putative direct and inverted repeats are shown in red and blue, respectively. Protein-coding and noncoding sequences are indicated by upper- and lower-case letters, respectively. The gt-ag of intron 3 in slc26a2 is shown by gray boxes. Tbe Trematomus bernacchii, Nro Notothenia rossii, Cac Chaenocephalus aceratus, Han Harpagifer antarcticus, Pal Pogonophryne albipinna, Ema Eleginops maclovinus, Bdi Bovichtus diacanthus.
The insertion of NTE-1s into multiple loci of the notothenioid genome is shown in Fig. 6D–K. NTE-1s had a putative inverted repeat sequence beginning at tgt and ending at aca, flanked by 4-nt putative direct repeat sequences (Fig. 6D–K). Importantly, the gt-ag of intron 3 was present in the putative direct and inverted repeats (Fig. 6D). These results suggest a history of intron 3 generation: insertion of NTE-1 into the exon of slc26a2 formed a new intron, and subsequent insertion of NTE-2 elongated the new intron. The central region of the new intron 5 of slc26a2 in the five notothenioids was highly homologous to NTE-3, and the 3′ end was homologous to NTE-4 (Fig. 6C). NTE-3 also contained a 37-nt repeat; however, the sequence was not homologous to that of NTE-2.
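The two sequence features described for NTE-1s, a terminal inverted repeat and short flanking direct repeats, can be checked programmatically. The Python sketch below is only an illustration of that check, not the procedure used in this study. The sequences are synthetic (they are not NTE-1 or slc26a2 sequences), and all function and variable names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (returned in lower case)."""
    pairs = {"a": "t", "t": "a", "g": "c", "c": "g", "n": "n"}
    return "".join(pairs[base] for base in reversed(seq.lower()))

def terminal_inverted_repeat_length(element):
    """Number of 5'-terminal bases that match the reverse complement of the 3' end."""
    seq = element.lower()
    rc = reverse_complement(seq)
    length = 0
    for left_base, rc_base in zip(seq, rc):
        if left_base != rc_base or length >= len(seq) // 2:
            break
        length += 1
    return length

def has_direct_repeat_flanks(locus, start, end, k=4):
    """True if the k nt before locus[start:end] equal the k nt after it."""
    if start < k or end + k > len(locus):
        return False
    return locus[start - k:start].lower() == locus[end:end + k].lower()

# SYNTHETIC example: an element that begins with tgt and ends with aca,
# inserted between two copies of a 4-nt direct repeat (attc).
ELEMENT = "tgtacc" + "aaaaa" + "ggtaca"
LOCUS = "ccgg" + "attc" + ELEMENT + "attc" + "ggcc"
START = 8
END = START + len(ELEMENT)

print(terminal_inverted_repeat_length(ELEMENT))
print(has_direct_repeat_flanks(LOCUS, START, END))
```

Note that tgt and aca are reverse complements of each other, so an element that begins with tgt and ends with aca has an inverted repeat of at least 3 nt by construction. In the synthetic example, the inverted repeat extends for 6 nt.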
The extant teleost slc26a1 and slc26a2 genes are derived from one of the ohnologs resulting from the teleost-specific whole-genome duplication (TGD)
Teleosts often have ohnologs derived from the TGD; however, all teleost species analyzed had one slc26a1 and one slc26a2. No species with two ohnologs of these genes were identified. This indicated that one of the two ohnologs generated in the ancestral teleost species was lost during evolution. However, it remained possible that slc26a1 and slc26a2 in different extant teleosts were derived from different ohnologs.
Therefore, to clarify whether slc26a1 and slc26a2 of extant teleosts originated from a single ohnolog or were derived from two ohnologs, we performed synteny analysis and analyzed the composition of ohnologs in neighboring genes (Supplementary Fig. S5). The tnks1 gene was commonly found near slc26a1 (Supplementary Fig. S5A), and other ohnologs of tnks1 were found at other loci in teleosts. Molecular phylogenetic analysis of tnks1 classified them as tnks1a or tnks1b (Supplementary Fig. S5B). All teleost species examined had slc26a1 in close proximity to tnks1a but not to tnks1b (Supplementary Fig. S5). In the phylogenetic tree shown in Fig. 4, no branches indicated the presence of two slc26a1 ohnologs. These results indicated that all teleost slc26a1 examined in this study originated from one of the two ohnologs derived from the TGD. In the case of slc26a2, we could not find neighboring genes with conserved ohnologs at other loci. However, the phylogenetic tree of teleost Slc26a2 did not show any branches indicating the presence of two slc26a2 ohnologs. Therefore, it is highly likely that all teleost slc26a2 examined in this study originated from one of the two ohnologs derived from the TGD.
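The logic of the synteny test for slc26a1 can be sketched as follows. This Python example is a hypothetical illustration, not the pipeline used in this study: the gene orders are toy data, the placeholder names geneA to geneG and species_1 and species_2 are invented, and the window of three neighboring genes is an arbitrary assumption.

```python
def flanking_genes(gene_order, target, window=3):
    """Genes located within `window` positions on either side of `target`."""
    i = gene_order.index(target)
    upstream = gene_order[max(0, i - window):i]
    downstream = gene_order[i + 1:i + 1 + window]
    return set(upstream) | set(downstream)

def linked_paralogs(chromosomes, target, candidates, window=3):
    """Which candidate paralogs lie near `target` on any chromosome or scaffold."""
    found = set()
    for gene_order in chromosomes:
        if target in gene_order:
            found |= flanking_genes(gene_order, target, window) & set(candidates)
    return found

# TOY DATA: each species is a list of chromosomes; each chromosome is a gene order.
TOY_GENOMES = {
    "species_1": [["geneA", "tnks1a", "slc26a1", "geneB"],
                  ["geneC", "tnks1b", "geneD"]],
    "species_2": [["geneE", "slc26a1", "geneF", "tnks1a"],
                  ["tnks1b", "geneG"]],
}

for species, chromosomes in TOY_GENOMES.items():
    hits = linked_paralogs(chromosomes, "slc26a1", ("tnks1a", "tnks1b"))
    print(species, sorted(hits))
```

If slc26a1 were derived from different ohnologs in different lineages, some species would be expected to show linkage to tnks1b instead of tnks1a. In the toy data, as in the pattern reported above, only tnks1a is found next to slc26a1.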
Discussion
In this study, we demonstrated the presence of intron turnover of slc26a1 and slc26a2 in ray-finned fish and determined the timing of intron turnover. In cartilaginous fish, lobe-finned fish, and tetrapods, slc26a1 and slc26a2 share a common two-exon structure (Fig. 1), and some conservation of synteny exists between their loci23. These results suggest that slc26a1 and slc26a2 are ohnologs of the 2R whole-genome duplication occurring in the ancestral species of jawed vertebrates and that the two-exon structure is the primitive structure of slc26a1 and slc26a2 in jawed vertebrates23. In addition, these analyses indicated that cartilaginous fish, lobe-finned fish, and tetrapods have conserved two-exon structures of slc26a1 and slc26a2. In contrast to species in these lineages, in ray-finned fish, exon–intron structures are heterogeneous, and intron turnover has been observed at certain times during their evolution. These results are consistent with previous reports that indicate high intron turnover in ray-finned fish18,20, and provide a good example for understanding intron turnover in ray-finned fish.
In ray-finned fish, intron turnover occurred in slc26a1 and slc26a2 in the common ancestor of Eurypterygii, increasing by five and one intron(s), respectively (Fig. 4). In addition, further intron turnover was observed in slc26a2; an ancestral species of some Syngnathiformes lost one intron, and an ancestral species of Notothenioidei gained two introns (Fig. 5). These observations showed that intron turnover occurred at particular times and did not occur broadly or frequently in a variety of species of ray-finned fish. The reasons for this include the following: acquisition of these introns may have been advantageous for survival, or some bottleneck or founder effect may have influenced the timing of intron turnover. Currently, the benefits arising from the acquisition of these introns are unclear. Intron gain can lead to diversity in alternative splicing. Several reports of alternative splicing exist in other members with approximately 20 exons, such as slc26a6 and slc26a766,67. Future analysis of Eurypterygii species transcripts is expected to reveal whether the intron gain plays a role in generating new splicing isoforms. Synteny analyses suggested that extant slc26a1 and slc26a2 are orthologs of one of two ohnologs generated by the teleost-specific whole-genome duplication, and no paralogs for slc26a1 and slc26a2 have been found in ray-finned fishes. Therefore, the intron turnover observed in this study occurred in the orthologs of slc26a1 and slc26a2 in ray-finned fish.
Introns 3 and 5 of slc26a2 in the five notothenioid species were recently acquired in their common ancestor. These introns consisted of putative transposable elements, suggesting that they were generated by transposon insertion. Genome analyses of notothenioid species indicated a two-fold change in genome size due to the expansion of transposable element families63,68. Therefore, the intron gain in notothenioid slc26a2 is likely to be explained as part of a genome-wide change. Transposon insertion is one of the major mechanisms of intron acquisition2,16. The expansion of transposable element families in notothenioid species is expected to be an excellent target for understanding the mechanism of intron acquisition via transposon insertion.
Some of the newly acquired introns in slc26a1 and slc26a2 of Eurypterygii were inserted at the same positions as the introns of other slc26 family members, such as slc26a3 or jawless fish slc26a2. A comparison of the intron insertion sites is presented in Fig. 2. Several new introns were inserted at the same positions as, or very close to, the introns of other slc26 members, such as slc26a3, which does not belong to the slc26a1/slc26a2/slc26a12 subfamily and consists of 19 exons and 18 introns (Fig. 2). All genes had an intron corresponding to intron 1 of slc26a1 and slc26a2 in tetrapods and cartilaginous fishes, suggesting that this intron is very old and was present before the separation of jawed and jawless vertebrates. The insertion sites of intron 1 of Eurypterygii (e.g., Atlantic cod) and Galaxiiformes (peladilla) slc26a1 matched those of introns 3 and 2 of slc26a1-like in hagfish and lamprey, respectively. The insertion sites of intron 5 of Eurypterygii slc26a1 and intron 3 of Galaxiiformes slc26a1 matched intron 9 of slc26a3. The position of intron 2 of basal ray-finned fish slc26a2 and its corresponding introns, intron 3 of slc26a2 in Osmeriformes (e.g., European smelt), intron 3 of slc26a2 in Eurypterygii other than notothenioids, and intron 4 of slc26a2 in the five notothenioids (e.g., emerald rockcod), matched the site of intron 10 of slc26a3. The positions of intron 2 of slc26a12-like in hagfish and intron 5 of slc26a2-like in hagfish and lamprey matched those of intron 8 in slc26a3. Therefore, the slc26 family genes convergently acquired introns at the same sites. The mechanism and functional significance of convergent intron acquisition in slc26 genes are novel questions that should be addressed in future research.
Conclusion
Intron gain and loss are rare events in vertebrates; however, high levels of intron turnover have been observed in teleosts. slc26a1 and slc26a2 are members of the anion exchanger gene family, and their protein-coding regions are encoded by two exons in cartilaginous fish and tetrapods. In the present study, a comparative analysis of the genomes of 62 ray-finned fish species showed that new intron insertions and deletions in these genes occurred at different times in certain fish species. These results provide a useful example for understanding the high levels of intron turnover in teleosts.
Data availability
The data underlying this article are available in the International Nucleotide Sequence Database Collaboration (INSDC), consisting of DDBJ, EMBL Bank and GenBank, under the accession numbers indicated in Materials and Methods.
References
Chorev, M. & Carmel, L. The function of introns. Front. Genet. 3, 55. https://doi.org/10.3389/fgene.2012.00055 (2012).
Roy, S. W. & Gilbert, W. The evolution of spliceosomal introns: patterns, puzzles and progress. Nat. Rev. Genet. 7, 211–221. https://doi.org/10.1038/nrg1807 (2006).
Chen, M. & Manley, J. L. Mechanisms of alternative splicing regulation: insights from molecular and genomics approaches. Nat. Rev. Mol. Cell Biol. 10, 741–754. https://doi.org/10.1038/nrm2777 (2009).
Jo, B. S. & Choi, S. S. Introns: The Functional Benefits of Introns in Genomes. Genomics Inform. 13, 112–118. https://doi.org/10.5808/GI.2015.13.4.112 (2015).
Matlin, A. J., Clark, F. & Smith, C. W. Understanding alternative splicing: towards a cellular code. Nat. Rev. Mol. Cell Biol. 6, 386–398. https://doi.org/10.1038/nrm1645 (2005).
Shaul, O. How introns enhance gene expression. Int. J. Biochem. Cell Biol. 91, 145–155. https://doi.org/10.1016/j.biocel.2017.06.016 (2017).
Rearick, D. et al. Critical association of ncRNA with introns. Nucleic Acids Res. 39, 2357–2366. https://doi.org/10.1093/nar/gkq1080 (2011).
Schwartz, S., Meshorer, E. & Ast, G. Chromatin organization marks exon-intron structure. Nat. Struct. Mol. Biol. 16, 990–995. https://doi.org/10.1038/nsmb.1659 (2009).
Otto, S. P. & Barton, N. H. The evolution of recombination: removing the limits to natural selection. Genetics 147, 879–906. https://doi.org/10.1093/genetics/147.2.879 (1997).
Carvunis, A. R. et al. Proto-genes and de novo gene birth. Nature 487, 370–374. https://doi.org/10.1038/nature11184 (2012).
Grandchamp, A., Berk, K., Dohmen, E. & Bornberg-Bauer, E. New genomic signals underlying the emergence of human proto-genes. Genes (Basel) https://doi.org/10.3390/genes13020284 (2022).
Fedorov, A., Merican, A. F. & Gilbert, W. Large-scale comparison of intron positions among animal, plant, and fungal genes. Proc. Natl. Acad. Sci. U. S. A. 99, 16128–16133. https://doi.org/10.1073/pnas.242624899 (2002).
Rogozin, I. B., Wolf, Y. I., Sorokin, A. V., Mirkin, B. G. & Koonin, E. V. Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr. Biol. 13, 1512–1517. https://doi.org/10.1016/s0960-9822(03)00558-x (2003).
Roy, S. W. & Penny, D. Patterns of intron loss and gain in plants: intron loss-dominated evolution and genome-wide comparison of O. sativa and A. thaliana. Mol. Biol. Evol. 24, 171–181. https://doi.org/10.1093/molbev/msl159 (2007).
van Schendel, R. & Tijsterman, M. Microhomology-mediated intron loss during metazoan evolution. Genome Biol. Evol. 5, 1212–1219. https://doi.org/10.1093/gbe/evt088 (2013).
Yenerall, P. & Zhou, L. Identifying the mechanisms of intron gain: progress and trends. Biol. Direct 7, 29. https://doi.org/10.1186/1745-6150-7-29 (2012).
Jeffares, D. C., Mourier, T. & Penny, D. The biology of intron gain and loss. Trends Genet. 22, 16–22. https://doi.org/10.1016/j.tig.2005.10.006 (2006).
Ravi, V. & Venkatesh, B. The divergent genomes of teleosts. Annu. Rev. Anim. Biosci. 6, 47–68. https://doi.org/10.1146/annurev-animal-030117-014821 (2018).
Loh, Y. H., Brenner, S. & Venkatesh, B. Investigation of loss and gain of introns in the compact genomes of pufferfishes (Fugu and Tetraodon). Mol. Biol. Evol. 25, 526–535. https://doi.org/10.1093/molbev/msm278 (2008).
Venkatesh, B. et al. Elephant shark genome provides unique insights into gnathostome evolution. Nature 505, 174–179. https://doi.org/10.1038/nature12826 (2014).
Alper, S. L. & Sharma, A. K. The SLC26 gene family of anion transporters and channels. Mol. Aspects Med. 34, 494–515. https://doi.org/10.1016/j.mam.2012.07.009 (2013).
Mount, D. B. & Romero, M. F. The SLC26 gene family of multifunctional anion exchangers. Pflugers Arch. 447, 710–721. https://doi.org/10.1007/s00424-003-1090-3 (2004).
Nagashima, A., Torii, K., Ota, C. & Kato, A. slc26a12-A novel member of the slc26 family, is located in tandem with slc26a2 in coelacanths, amphibians, reptiles, and birds. Physiol. Rep. 12, e16089. https://doi.org/10.14814/phy2.16089 (2024).
Hastbacka, J. et al. The diastrophic dysplasia gene encodes a novel sulfate transporter: positional cloning by fine-structure linkage disequilibrium mapping. Cell 78, 1073–1087. https://doi.org/10.1016/0092-8674(94)90281-x (1994).
Pfau, A. et al. SLC26A1 is a major determinant of sulfate homeostasis in humans. J. Clin. Invest. https://doi.org/10.1172/JCI161849 (2023).
Lohi, H. et al. Mapping of five new putative anion transporter genes in human and characterization of SLC26A6, a candidate gene for pancreatic anion exchanger. Genomics 70, 102–112. https://doi.org/10.1006/geno.2000.6355 (2000).
Schweinfest, C. W., Henderson, K. W., Suster, S., Kondoh, N. & Papas, T. S. Identification of a colon mucosa gene that is down-regulated in colon adenomas and adenocarcinomas. Proc. Natl. Acad. Sci. U. S. A. 90, 4166–4170. https://doi.org/10.1073/pnas.90.9.4166 (1993).
Schweinfest, C. W. et al. slc26a3 (dra)-deficient mice display chloride-losing diarrhea, enhanced colonic proliferation, and distinct up-regulation of ion transporters in the colon. J. Biol. Chem. 281, 37962–37971. https://doi.org/10.1074/jbc.M607527200 (2006).
Knauf, F. et al. Identification of a chloride-formate exchanger expressed on the brush border membrane of renal proximal tubule cells. Proc. Natl. Acad. Sci. U. S. A. 98, 9425–9430. https://doi.org/10.1073/pnas.141241098 (2001).
Wang, Z. et al. Renal and intestinal transport defects in Slc26a6-null mice. Am. J. Physiol. Cell Physiol. 288, C957-965. https://doi.org/10.1152/ajpcell.00505.2004 (2005).
Jiang, Z. et al. Calcium oxalate urolithiasis in mice lacking anion transporter Slc26a6. Nat. Genet. 38, 474–478. https://doi.org/10.1038/ng1762 (2006).
Coyle, B. et al. Pendred syndrome (goitre and sensorineural hearing loss) maps to chromosome 7 in the region containing the nonsyndromic deafness gene DFNB4. Nat. Genet. 12, 421–423. https://doi.org/10.1038/ng0496-421 (1996).
Sheffield, V. C. et al. Pendred syndrome maps to chromosome 7q21-34 and is caused by an intrinsic defect in thyroid iodine organification. Nat. Genet. 12, 424–426. https://doi.org/10.1038/ng0496-424 (1996).
Zheng, J. et al. Prestin is the motor protein of cochlear outer hair cells. Nature 405, 149–155. https://doi.org/10.1038/35012009 (2000).
Bissig, M., Hagenbuch, B., Stieger, B., Koller, T. & Meier, P. J. Functional expression cloning of the canalicular sulfate transport system of rat hepatocytes. J. Biol. Chem. 269, 3017–3021 (1994).
Jackson, G. C. et al. Pseudoachondroplasia and multiple epiphyseal dysplasia: a 7-year comprehensive analysis of the known disease genes identify novel and recurrent mutations and provides an accurate assessment of their relative contribution. Hum. Mutat. 33, 144–157. https://doi.org/10.1002/humu.21611 (2012).
Hasegawa, K. et al. Sulfate transporters involved in sulfate secretion in the kidney are localized in the renal proximal tubule II of the elephant fish (Callorhinchus milii). Am. J. Physiol. Regul. Integr. Comp. Physiol. 311, R66-78. https://doi.org/10.1152/ajpregu.00477.2015 (2016).
Katoh, F. et al. Cloning of rainbow trout SLC26A1: involvement in renal sulfate secretion. Am. J. Physiol. Regul. Integr. Comp. Physiol. 290, R1468-1478. https://doi.org/10.1152/ajpregu.00482.2005 (2006).
Nakada, T. et al. Roles of Slc13a1 and Slc26a1 sulfate transporters of eel kidney in sulfate homeostasis and osmoregulation in freshwater. Am. J. Physiol. Regul. Integr. Comp. Physiol. 289, R575–R585. https://doi.org/10.1152/ajpregu.00725.2004 (2005).
Kato, A. & Watanabe, T. Renal sulfate regulation. in Eel Physiology (eds F. Trischitta, Y. Takei & P. Sébert) 249-281 (CRC Press, 2016).
Watanabe, T. & Takei, Y. Molecular physiology and functional morphology of SO42- excretion by the kidney of seawater-adapted eels. J. Exp. Biol. 214, 1783–1790. https://doi.org/10.1242/jeb.051789 (2011).
Liu, F. et al. Solute carrier family 26 member a2 (slc26a2) regulates otic development and hair cell survival in zebrafish. PLoS ONE 10, e0136832. https://doi.org/10.1371/journal.pone.0136832 (2015).
Lee, A., Beck, L. & Markovich, D. The mouse sulfate anion transporter gene Sat1 (Slc26a1): cloning, tissue distribution, gene structure, functional characterization, and transcriptional regulation thyroid hormone. DNA Cell Biol. 22, 19–31. https://doi.org/10.1089/104454903321112460 (2003).
Lee, A., Dawson, P. A. & Markovich, D. NaSi-1 and Sat-1: structure, function and transcriptional regulation of two genes encoding renal proximal tubular sulfate transporters. Int. J. Biochem. Cell Biol. 37, 1350–1356. https://doi.org/10.1016/j.biocel.2005.02.013 (2005).
Regeer, R. R., Lee, A. & Markovich, D. Characterization of the human sulfate anion transporter (hsat-1) protein and gene (SAT1; SLC26A1). DNA Cell Biol. 22, 107–117. https://doi.org/10.1089/104454903321515913 (2003).
Rossi, A. & Superti-Furga, A. Mutations in the diastrophic dysplasia sulfate transporter (DTDST) gene (SLC26A2): 22 novel mutations, mutation review, associated skeletal phenotypes, and diagnostic relevance. Hum. Mutat. 17, 159–171. https://doi.org/10.1002/humu.1 (2001).
Lohi, H. et al. Functional characterization of three novel tissue-specific anion exchangers SLC26A7, -A8, and -A9. J. Biol. Chem. 277, 14246–14254. https://doi.org/10.1074/jbc.M111802200 (2002).
Vincourt, J. B., Jullien, D., Amalric, F. & Girard, J. P. Molecular and functional characterization of SLC26A11, a sodium-independent sulfate transporter from high endothelial venules. FASEB J. 17, 890–892. https://doi.org/10.1096/fj.02-0787fje (2003).
Waldegger, S. et al. Cloning and characterization of SLC26A6, a novel member of the solute carrier 26 gene family. Genomics 72, 43–50. https://doi.org/10.1006/geno.2000.6445 (2001).
Wedenoja, S. et al. Update on SLC26A3 mutations in congenital chloride diarrhea. Hum. Mutat. 32, 715–722. https://doi.org/10.1002/humu.21498 (2011).
Rangwala, S. H. et al. Accessing NCBI data using the NCBI sequence viewer and genome data viewer (GDV). Genome Res. 31, 159–169. https://doi.org/10.1101/gr.266932.120 (2021).
Martin, F. J. et al. Ensembl 2023. Nucleic Acids Res. 51, D933–D941. https://doi.org/10.1093/nar/gkac958 (2023).
Chenna, R. et al. Multiple sequence alignment with the clustal series of programs. Nucleic Acids Res. 31, 3497–3500. https://doi.org/10.1093/nar/gkg500 (2003).
Le, S. Q. & Gascuel, O. An improved general amino acid replacement matrix. Mol. Biol. Evol. 25, 1307–1320. https://doi.org/10.1093/molbev/msn067 (2008).
Trifinopoulos, J., Nguyen, L. T., von Haeseler, A. & Minh, B. Q. W-IQ-TREE: a fast online phylogenetic tool for maximum likelihood analysis. Nucleic Acids Res. 44, W232-235. https://doi.org/10.1093/nar/gkw256 (2016).
Hoang, D. T., Chernomor, O., von Haeseler, A., Minh, B. Q. & Vinh, L. S. UFBoot2: Improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518–522. https://doi.org/10.1093/molbev/msx281 (2018).
Tamura, K., Stecher, G. & Kumar, S. MEGA11: Molecular evolutionary genetics analysis version 11. Mol. Biol. Evol. 38, 3022–3027. https://doi.org/10.1093/molbev/msab120 (2021).
Johnson, M. et al. NCBI BLAST: a better web interface. Nucleic Acids Res. 36, W5-9. https://doi.org/10.1093/nar/gkn201 (2008).
Morales, J. et al. A joint NCBI and EMBL-EBI transcript set for clinical genomics and research. Nature 604, 310–315. https://doi.org/10.1038/s41586-022-04558-8 (2022).
Lavoue, S., Miya, M., Poulsen, J. Y., Moller, P. R. & Nishida, M. Monophyly, phylogenetic position and inter-familial relationships of the Alepocephaliformes (Teleostei) based on whole mitogenome sequences. Mol. Phylogenet. Evol. 47, 1111–1121. https://doi.org/10.1016/j.ympev.2007.12.002 (2008).
Near, T. J. et al. Resolution of ray-finned fish phylogeny and timing of diversification. Proc. Natl. Acad. Sci. U. S. A. 109, 13698–13703. https://doi.org/10.1073/pnas.1206625109 (2012).
Kumar, S. et al. TimeTree 5: An expanded resource for species divergence times. Mol. Biol. Evol. https://doi.org/10.1093/molbev/msac174 (2022).
Bista, I. et al. Genomics of cold adaptations in the Antarctic notothenioid fish radiation. Nat. Commun. 14, 3412. https://doi.org/10.1038/s41467-023-38567-6 (2023).
Near, T. J. Estimating divergence times of notothenioid fishes using a fossil-calibrated molecular clock. Antarct. Sci. 16, 37–44 (2004).
Betancur, R. R. et al. Phylogenetic classification of bony fishes. BMC Evol. Biol. 17, 162. https://doi.org/10.1186/s12862-017-0958-3 (2017).
Vincourt, J. B., Jullien, D., Kossida, S., Amalric, F. & Girard, J. P. Molecular cloning of SLC26A7, a novel member of the SLC26 sulfate/anion transporter family, from high endothelial venules and kidney. Genomics 79, 249–256. https://doi.org/10.1006/geno.2002.6689 (2002).
Lohi, H. et al. Isoforms of SLC26A6 mediate anion transport and have functional PDZ interaction domains. Am. J. Physiol. Cell Physiol. 284, C769-779. https://doi.org/10.1152/ajpcell.00270.2002 (2003).
Daane, J. M. & Detrich, H. W. 3rd. Adaptations and diversity of Antarctic fishes: A genomic perspective. Annu. Rev. Anim. Biosci. 10, 39–62. https://doi.org/10.1146/annurev-animal-081221-064325 (2022).
Yu, D. et al. Hagfish genome elucidates vertebrate whole-genome duplication events and their evolutionary consequences. Nat. Ecol. Evol. https://doi.org/10.1038/s41559-023-02299-z (2024).
Smith, J. J. et al. The sea lamprey germline genome provides insights into programmed genome rearrangement and vertebrate evolution. Nat. Genet. 50, 270–277. https://doi.org/10.1038/s41588-017-0036-1 (2018).
Marletaz, F. et al. The little skate genome and the evolutionary emergence of wing-like fins. Nature 616, 495–503. https://doi.org/10.1038/s41586-023-05868-1 (2023).
Mayeur, H. et al. The sensory shark: high-quality morphological, genomic and transcriptomic data for the small-spotted catshark Scyliorhinus canicula reveal the molecular bases of sensory organ evolution in jawed vertebrates. bioRxiv https://doi.org/10.1101/2024.05.23.595469 (2024).
Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860–921. https://doi.org/10.1038/35057062 (2001).
Wang, C. et al. A novel canine reference genome resolves genomic architecture and uncovers transcript complexity. Commun. Biol. 4, 185. https://doi.org/10.1038/s42003-021-01698-x (2021).
Lindblad-Toh, K. et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478, 476–482. https://doi.org/10.1038/nature10530 (2011).
Palkopoulou, E. et al. A comprehensive genomic history of extinct and living elephants. Proc. Natl. Acad. Sci. U. S. A. 115, E2566–E2574. https://doi.org/10.1073/pnas.1720554115 (2018).
Mikkelsen, T. S. et al. Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. Nature 447, 167–177. https://doi.org/10.1038/nature05805 (2007).
Zhou, Y. et al. Platypus and echidna genomes reveal mammalian biology and evolution. Nature 592, 756–762. https://doi.org/10.1038/s41586-020-03039-0 (2021).
Smith, J. et al. Fourth report on chicken genes and chromosomes 2022. Cytogenet. Genome Res. 162, 405–528. https://doi.org/10.1159/000529376 (2022).
Rice, E. S. et al. Improved genome assembly of American alligator genome reveals conserved architecture of estrogen signaling. Genome Res. 27, 686–696. https://doi.org/10.1101/gr.213595.116 (2017).
Shaffer, H. B. et al. The western painted turtle genome, a model for the evolution of extreme physiological adaptations in a slowly evolving lineage. Genome Biol. 14, R28. https://doi.org/10.1186/gb-2013-14-3-r28 (2013).
Alfoldi, J. et al. The genome of the green anole lizard and a comparative analysis with birds and mammals. Nature 477, 587–591. https://doi.org/10.1038/nature10390 (2011).
Hellsten, U. et al. The genome of the Western clawed frog Xenopus tropicalis. Science 328, 633–636. https://doi.org/10.1126/science.1183670 (2010).
Rhie, A. et al. Towards complete and error-free genome assemblies of all vertebrate species. Nature 592, 737–746. https://doi.org/10.1038/s41586-021-03451-0 (2021).
Wang, K. et al. African lungfish genome sheds light on the vertebrate water-to-land transition. Cell https://doi.org/10.1016/j.cell.2021.01.047 (2021).
Amemiya, C. T. et al. The African coelacanth genome provides insights into tetrapod evolution. Nature 496, 311–316. https://doi.org/10.1038/nature12027 (2013).
Bi, X. et al. Tracing the genetic footprints of vertebrate landing in non-teleost ray-finned fishes. Cell https://doi.org/10.1016/j.cell.2021.01.046 (2021).
Du, K. et al. The sterlet sturgeon genome sequence and the mechanisms of segmental rediploidization. Nat. Ecol. Evol. 4, 841–852. https://doi.org/10.1038/s41559-020-1166-x (2020).
Braasch, I. et al. The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons. Nat. Genet. 48, 427–437. https://doi.org/10.1038/ng.3526 (2016).
Parey, E. et al. Genome structures resolve the early diversification of teleost fishes. Science 379, 572–575. https://doi.org/10.1126/science.abq4257 (2023).
Bian, C. et al. The Asian Arowana (Scleropages formosus) genome provides new insights into the evolution of an early lineage of teleosts. Sci. Rep. 6, 24501. https://doi.org/10.1038/srep24501 (2016).
Martinez Barrio, A. et al. The genetic basis for ecological adaptation of the Atlantic herring revealed by genome sequencing. Elife https://doi.org/10.7554/eLife.12081 (2016).
Klymus, K. E., Hrabik, R. A., Thompson, N. L. & Cornman, R. S. Genome resequencing clarifies phylogeny and reveals patterns of selection in the toxicogenomics model Pimephales promelas. PeerJ 10, e13954. https://doi.org/10.7717/peerj.13954 (2022).
Howe, K. et al. The zebrafish reference genome sequence and its relationship to the human genome. Nature 496, 498–503. https://doi.org/10.1038/nature12111 (2013).
Warren, W. C. et al. A chromosome-level genome of Astyanax mexicanus surface fish for comparing population-specific genetic differences contributing to trait evolution. Nat. Commun. 12, 1447. https://doi.org/10.1038/s41467-021-21733-z (2021).
Gallant, J. R. et al. Nonhuman genetics. Genomic basis for the convergent evolution of electric organs. Science 344, 1522–1525. https://doi.org/10.1126/science.1254432 (2014).
Waldbieser, G. C. et al. Reference genomes of channel catfish and blue catfish reveal multiple pericentric chromosome inversions. BMC Biol. 21, 67. https://doi.org/10.1186/s12915-023-01556-8 (2023).
Rondeau, E. B. et al. The genome and linkage map of the northern pike (Esox lucius): conserved synteny revealed between the salmonid sister group and the Neoteleostei. PLoS ONE 9, e102089. https://doi.org/10.1371/journal.pone.0102089 (2014).
Gao, G. et al. A long reads-based de-novo assembly of the genome of the Arlee homozygous line reveals chromosomal rearrangements in rainbow trout. G3 (Bethesda) https://doi.org/10.1093/g3journal/jkab052 (2021).
Nakamoto, M. & Sakamoto, T. Improvement of the Ayu (Plecoglossus altivelis) draft genome using Hi-C sequencing. BMC Res. Notes 16, 92. https://doi.org/10.1186/s13104-023-06362-7 (2023).
Malmstrom, M., Matschiner, M., Torresen, O. K., Jakobsen, K. S. & Jentoft, S. Whole genome sequencing data and de novo draft assemblies for 66 teleost species. Sci. Data 4, 160132. https://doi.org/10.1038/sdata.2016.132 (2017).
Star, B. et al. The genome sequence of Atlantic cod reveals a unique immune system. Nature 477, 207–210. https://doi.org/10.1038/nature10342 (2011).
Bo, J. et al. Opah (Lampris megalopsis) genome sheds light on the evolution of aquatic endothermy. Zool. Res. 43, 26–29. https://doi.org/10.24272/j.issn.2095-8137.2021.183 (2022).
Bian, C. et al. Genomics comparisons of three chromosome-level mudskipper genome assemblies reveal molecular clues for water-to-land evolution and adaptation. J. Adv. Res. https://doi.org/10.1016/j.jare.2023.05.005 (2023).
Stervander, M. & Cresko, W. A. A highly contiguous nuclear genome assembly of the mandarinfish Synchiropus splendidus (Syngnathiformes: Callionymidae). G3 (Bethesda) https://doi.org/10.1093/g3journal/jkab306 (2021).
Roth, O. et al. Evolution of male pregnancy associated with remodeling of canonical vertebrate immunity in seahorses and pipefishes. Proc. Natl. Acad. Sci. U. S. A. 117, 9431–9439. https://doi.org/10.1073/pnas.1916251117 (2020).
Small, C. M. et al. Leafy and weedy seadragon genomes connect genic and repetitive DNA features to the extravagant biology of syngnathid fishes. Proc. Natl. Acad. Sci. U. S. A. 119, e2119602119. https://doi.org/10.1073/pnas.2119602119 (2022).
He, L. et al. Genome and gene evolution of seahorse species revealed by the chromosome-level genome of Hippocampus abdominalis. Mol. Ecol. Resour. 22, 1465–1477. https://doi.org/10.1111/1755-0998.13541 (2022).
Zhao, X. et al. Chromosome-scale assembly of the Monopterus genome. Gigascience https://doi.org/10.1093/gigascience/giy046 (2018).
Ou, M. et al. Chromosome-level genome assemblies of Channa argus and Channa maculata and comparative analysis of their temperature adaptability. Gigascience https://doi.org/10.1093/gigascience/giab070 (2021).
Hashiguchi, Y. et al. Draft genome of Akame (Lates japonicus) reveals possible genetic mechanisms for long-term persistence and adaptive evolution with low genetic diversity. Genome Biol. Evol. https://doi.org/10.1093/gbe/evae174 (2024).
Araki, K. et al. Whole genome sequencing of greater amberjack (Seriola dumerili) for SNP identification on aligned scaffolds and genome structural variation analysis using parallel resequencing. Int. J. Genomics 2018, 7984292. https://doi.org/10.1155/2018/7984292 (2018).
Figueras, A. et al. Whole genome sequencing of turbot (Scophthalmus maximus; Pleuronectiformes): a fish adapted to demersal life. DNA Res. 23, 181–192. https://doi.org/10.1093/dnares/dsw007 (2016).
Ryu, T. et al. A chromosome-scale genome assembly of the false clownfish, Amphiprion ocellaris. G3 (Bethesda) https://doi.org/10.1093/g3journal/jkac074 (2022).
Conte, M. A., Gammerdinger, W. J., Bartie, K. L., Penman, D. J. & Kocher, T. D. A high quality assembly of the Nile Tilapia (Oreochromis niloticus) genome reveals the structure of two sex determination regions. BMC Genomics 18, 341. https://doi.org/10.1186/s12864-017-3723-5 (2017).
Kasahara, M. et al. The medaka draft genome and insights into vertebrate genome evolution. Nature 447, 714–719. https://doi.org/10.1038/nature05846 (2007).
Kelley, J. L. et al. The genome of the self-fertilizing mangrove Rivulus fish, Kryptolebias marmoratus: A model for studying phenotypic plasticity and adaptations to extreme environments. Genome Biol. Evol. 8, 2145–2154. https://doi.org/10.1093/gbe/evw145 (2016).
Schartl, M. et al. The genome of the platyfish, Xiphophorus maculatus, provides insights into evolutionary adaptation and several complex traits. Nat. Genet. 45, 567–572. https://doi.org/10.1038/ng.2604 (2013).
Liu, D. et al. Chromosome-level genome assembly of the endangered humphead wrasse Cheilinus undulatus: Insight into the expansion of opsin genes in fishes. Mol. Ecol. Resour. 21, 2388–2406. https://doi.org/10.1111/1755-0998.13429 (2021).
Yang, C. et al. Screening of genes related to sex determination and differentiation in mandarin fish (Siniperca chuatsi). Int. J. Mol. Sci. https://doi.org/10.3390/ijms23147692 (2022).
Tine, M. et al. European sea bass genome and its variation provide insights into adaptation to euryhalinity and speciation. Nat. Commun. 5, 5770. https://doi.org/10.1038/ncomms6770 (2014).
Ao, J. et al. Genome sequencing of the perciform fish Larimichthys crocea provides insights into molecular and genetic mechanisms of stress adaptation. PLoS Genet. 11, e1005118. https://doi.org/10.1371/journal.pgen.1005118 (2015).
Bertolini, F. et al. Whole genome sequencing provides information on the genomic architecture and diversity of cultivated gilthead seabream (Sparus aurata) broodstock nuclei. Genes (Basel) https://doi.org/10.3390/genes14040839 (2023).
Aparicio, S. et al. Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science 297, 1301–1310 (2002).
Nath, S., Shaw, D. E. & White, M. A. Improved contiguity of the threespine stickleback genome using long-read sequencing. G3 (Bethesda) https://doi.org/10.1093/g3journal/jkab007 (2021).
Jones, F. C. et al. The genomic basis of adaptive evolution in threespine sticklebacks. Nature 484, 55–61. https://doi.org/10.1038/nature10944 (2012).
Cheng, C. C. et al. Chromosome-level genome assembly and circadian gene repertoire of the Patagonia Blennie Eleginops maclovinus: the closest ancestral proxy of Antarctic cryonotothenioids. Genes (Basel) https://doi.org/10.3390/genes14061196 (2023).
Kim, B. M. et al. Antarctic blackfin icefish genome reveals adaptations to extreme environments. Nat. Ecol. Evol. 3, 469–478. https://doi.org/10.1038/s41559-019-0812-7 (2019).
Jo, E. et al. Genome survey and microsatellite motif identification of Pogonophryne albipinna. Biosci. Rep. https://doi.org/10.1042/BSR20210824 (2021).
Acknowledgements
We would like to thank Yoko Yamamoto and Nana Shinohara, the Integrative Bioscience Facility, and the Open Research Facilities for Life Science and Technology at the Institute of Science Tokyo for their technical assistance in this study.
Funding
This study was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI, Grant Numbers 21H02281 (to A.K.) and 21K14781 (to A.N.); the Japan Science and Technology Agency (JST) SPRING, Grant Number JPMJSP2180 (to C.O.); and the Temporary Assistant Program by the Support for Work-Life Balance, DEI Section, Office of Communications and DEI, Institute of Science Tokyo (to A.N.).
Author information
Contributions
K.T. and A.K. conceived and designed the research; K.T., C.O., A.N., M.K., and A.K. performed the experiments; K.T., C.O., A.N., M.K., and A.K. analyzed the data; K.T., C.O., A.N., M.K., and A.K. interpreted the results of the experiments; K.T., C.O., A.N., and A.K. prepared the figures; K.T. and A.K. drafted the manuscript; C.O., A.N., M.K., and A.K. edited and revised the manuscript; and K.T., C.O., A.N., M.K., and A.K. approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Torii, K., Ota, C., Nagashima, A. et al. Intron turnover of slc26a1 and slc26a2 and convergence of intron insertion sites. Sci Rep 15, 30007 (2025). https://doi.org/10.1038/s41598-025-15147-w
DOI: https://doi.org/10.1038/s41598-025-15147-w