Introduction

The Sparidae family consists of a diverse group of fish species found in both marine and brackish-water environments1. The family currently comprises approximately 164 recognized species in 39 genera, with eight new species described within the past decade, highlighting ongoing systematic efforts and the expanding understanding of its biodiversity2. They are commonly known as seabreams or porgies and are widely distributed across the tropical and subtropical Atlantic, Pacific, and Indian Oceans3. Based on their dental and jaw morphology, sparids exhibit herbivorous, piscivorous, and carnivorous feeding behaviors4. Moreover, dentition patterns have become widely recognized as critical diagnostic features in the taxonomic classification of the group5. Among this group, the red pandora, Pagellus bellottii, inhabits demersal waters at shallow depths ranging from 15 to 70 m2,6. This species is regarded as a vital resource for small- to large-scale commercial fisheries in several West African countries, such as Senegal, Mauritius, Ghana, and Guinea. Based on data reported to the Food and Agriculture Organization (FAO), P. bellottii accounted for 69% of the ~2 million tons of total Pagellus catch in FAO Fishing Area 34 (Eastern Atlantic) over the last 65 years7,8. Although P. bellottii is currently classified as ‘Least Concern’ by the IUCN, global concern has grown over its population decline, which indicates overexploitation in several West African countries9. Notably, a progressive increase in fishing pressure in FAO Fishing Area 34 has resulted in a downward trend in annual catch since 2005, highlighting the urgent need for sustainable fisheries management measures8,10. Therefore, to ensure the conservation of P. bellottii, one of the most economically important species in this region, the integration of both morphological and molecular approaches is urgently needed.

With respect to traditional taxonomy, emerging evidence suggests a range expansion of P. bellottii into the Bay of Biscay within FAO Fishing Area 27 (Northeast Atlantic) and the southeastern Mediterranean Sea11,12. This expanded natural distribution of P. bellottii has led to range overlap with its congeners (P. acarne, P. bogaraveo, and P. erythrinus), increasing the likelihood of their co-occurrence and simultaneous harvest within the same fishing grounds7. Hence, the overlapping distributions and shared morphological traits among Pagellus species often pose challenges for accurate species identification13,14. The red pandora, P. bellottii, is typically distinguished from other Pagellus species by its silvery-red coloration, the presence of blue spots along the flanks, and a characteristic dark red mark near the lateral line3,4. Moreover, their systematic position has been frequently revised across different taxonomic groups, often hindering accurate identification and complicating effective management and conservation efforts5. Therefore, the incorporation of molecular approaches is essential for accurate species identification, serving as a prerequisite for systematics research and the development of effective, species-specific fisheries management strategies. So far, several molecular studies have successfully facilitated species identification and characterization within Sparidae, contributing to phylogenetic interpretations, detection of cryptic diversity, and estimation of lineage diversification using partial mitochondrial and nuclear markers, as well as microsatellite DNA15,16,17,18. In addition, recent mitogenomic studies have rapidly expanded worldwide due to their effectiveness in elucidating genomic features, gene-level variation, and enabling comprehensive phylogenetic analyses across diverse groups of marine fishes, including members of the Sparidae family19,20,21. 
To date, three complete mitogenomes of the Pagellus species have been sequenced, all derived from specimens collected in the Mediterranean Sea22,23,24. As a result, the maternal inheritance-driven evolutionary pathways of these Pandora fishes remain poorly understood, largely due to the absence of mitogenomic data for congeners distributed in the Atlantic and western Indian Ocean. Therefore, the present study aims to (i) generate the first complete mitochondrial genome of the P. bellottii based on specimens from its native range in the Eastern Atlantic Ocean; (ii) assess the structural features and genetic variation among closely related congeners; (iii) evaluate its phylogenetic placement within major Sparidae lineages; and (iv) estimate divergence times and correlate them with historical geological events, present-day ocean currents, and climatic conditions to infer potential drivers of diversification. The findings of this study may serve as a foundational resource for future systematic research within the Sparidae family, while the generated genetic data could support further population genetic studies of Pagellus species. Additionally, the study provides valuable insights into the divergence time estimation of P. bellottii and other congeners, particularly within the biogeographic context of marine ecosystems, thereby contributing to a more comprehensive understanding of their evolutionary history and distribution patterns.

Results

Mitogenome organization and gene arrangement

The mitogenome of P. bellottii totals 16,666 bp and is available in GenBank under accession number PQ524309. It consists of 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, two ribosomal RNA (rRNA) genes, and a noncoding control region (CR). Of the 37 genes, 28 are located on the heavy strand, including 12 PCGs, two rRNAs, and 14 tRNAs, while the remaining nine genes (ND6 and eight tRNAs) are located on the light strand (Fig. 1; Table 1). Comparative analysis revealed that the total mitogenome length within Pagellus ranges from 16,486 bp in P. acarne to 16,941 bp in P. bogaraveo. The gene arrangement on the light and heavy strands closely mirrors that of other Pagellus congeners. The gene organization is consistent across Pagellus species, except for P. bogaraveo, which has an extra tRNA-Cys pseudogene within the tRNA gene cluster located on the light strand. The mitogenome of P. bellottii shows an AT bias (54.93%), with nucleotide composition values of 27.45% adenine (A), 27.48% thymine (T), 28.28% cytosine (C), and 16.79% guanine (G). The other Pagellus species also exhibit an AT bias, with P. bogaraveo having the lowest value (53.62%), followed by P. acarne (53.69%), P. bellottii (54.93%), and P. erythrinus (55%). The AT and GC skewness values for P. bellottii were manually calculated as −0.001 and −0.255, respectively. In comparison, the AT and GC skew values for the other Pagellus species were −0.002 and −0.261 in P. acarne, 0.005 and −0.256 in P. bogaraveo, and 0.0004 and −0.252 in P. erythrinus (Table 2). The complete mitogenome of P. bellottii includes seven overlapping regions spanning 37 bp, with the longest overlap of 13 bp shared between tRNA-Ser (S2) and COI. Additionally, the tRNA-Asn gene shows the largest intergenic spacer, spanning 38 bp.
Comparative analysis showed that three overlapping regions between PCGs are common to all examined Pagellus species: ND4L and ND4 overlap by 7 bp, ATP8 and ATP6 by 10 bp, and ND5 and ND6 by 4 bp (Supplementary Table S1).
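The composition and skew values above can be reproduced from the reported base percentages. The sketch below assumes the standard definitions, AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C); the text states only that the values were calculated manually, so these formulas and the function name `skews` are assumptions rather than the authors' documented procedure.

```python
def skews(a, t, g, c):
    """Return (AT content, AT skew, GC skew) from base counts or percentages."""
    at_content = a + t
    at_skew = (a - t) / (a + t)  # assumed standard definition
    gc_skew = (g - c) / (g + c)  # assumed standard definition
    return at_content, at_skew, gc_skew

# Whole-mitogenome base composition of P. bellottii (percent), as reported in the text
at_content, at_skew, gc_skew = skews(a=27.45, t=27.48, g=16.79, c=28.28)
print(f"AT content = {at_content:.2f}%")
print(f"AT skew = {at_skew:.3f}, GC skew = {gc_skew:.3f}")
```

Under these definitions, the rounded results agree with the reported AT content of 54.93% and skew values of −0.001 and −0.255.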

Fig. 1

(A) Geographic distribution map illustrating the IUCN-defined ranges of four Pagellus species across the Euro-African marine ecosystem, overlaid with major ocean currents and bathymetric features. The collection locality of P. bellottii is indicated with a Ghana circular flag symbol, and the species photograph was taken by the second author (E.O.M.E.). The map was generated using ArcGIS v10.6. Global-scale ocean current data were sourced from the NOAA National Weather Service and the U.S. Army (https://data.amerigeoss.org/dataset/major-ocean-currents-arrowpolys-100 m-76). Additionally, bathymetric layers derived from the CMIP6 Earth System Models were obtained from the Bio-ORACLE v3.0 database (https://bio-oracle.org/). (B) Circular representation of the complete mitochondrial genome of P. bellottii, showing gene arrangement through various colored arcs. This circular map was generated using the MitoFish MitoAnnotator web server and manually edited in Adobe Photoshop CS 8.0.

Table 1 Mitochondrial genome organization of P. bellottii, covering strand arrangement, size, and intergenic nucleotides (IN). ‘H’ and ‘L’ indicate the position of each gene on the heavy and light strands, respectively. ‘-’ marks indicate an incomplete stop codon.
Table 2 Nucleotide composition of mitochondrial genomes of P. bellottii and other congeners.

Protein-coding genes features

The complete mitogenome of P. bellottii contains 13 PCGs (four cytochromes, two ATP synthases, and seven NADH dehydrogenases) with a combined length of 11,443 bp, representing 68.7% of the complete mitogenome. Among the PCGs, ATP8 is the shortest at 168 bp, whereas ND5 is the longest at 1,839 bp (Table 1). In the other Pagellus species, the total PCG length was 11,439 bp in P. acarne, 11,442 bp in P. bogaraveo, and 11,444 bp in P. erythrinus. An evaluation of AT bias in the PCGs revealed values of 52.95% in P. bogaraveo, 53.34% in P. acarne, 54.56% in P. bellottii, and 54.70% in P. erythrinus. The AT-skewness values were −0.058 in P. bellottii, −0.097 in P. acarne, −0.093 in P. bogaraveo, and −0.092 in P. erythrinus. Similarly, the GC-skewness values were −0.322 in P. bellottii, −0.293 in P. acarne, −0.287 in P. bogaraveo, and −0.291 in P. erythrinus (Table 2). Nucleotide diversity (π) analysis of the PCGs produced an average value of 0.16187, encompassing 3,023 polymorphic nucleotides, with ND4 exhibiting the highest value of 0.26 (Fig. 2A). Furthermore, saturation analysis revealed that neither transitions nor transversions reached saturation, even with increasing Tamura-Nei 93 (TN93) divergence values across all PCGs in the Sparidae mitogenomes (Fig. 2B). The Open Reading Frame (ORF) finder tool successfully identified the initiation and termination codons for all 13 PCGs. The start codon ATG was identified in 11 PCGs (ND1, ND2, COII, COIII, ATP6, ATP8, ND3, ND4L, ND5, ND6, and Cytb), whereas GTG served as the start codon for COI and ND4. In terms of termination codons, three PCGs (ND1, ND4L, and ND5) terminated with TAA, while COI and ND6 used GTG and TAG, respectively. The remaining eight PCGs featured incomplete stop codons, either TA- or T--.
Comparative analysis revealed that the start codon GTG in COI is conserved across the other Pagellus species, whereas the use of GTG as the start codon in ND4 appears to be shared exclusively with P. erythrinus. Notably, termination codon usage differs among the four Pagellus species, with seven PCGs having incomplete stop codons in P. acarne, three in P. bogaraveo, and six in P. erythrinus (Supplementary Table S2).
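For readers unfamiliar with nucleotide diversity, π can be understood as the mean proportion of differing sites over all pairs of aligned sequences. The sketch below illustrates that definition on a toy alignment. It is not the authors' pipeline (the software and any sliding-window settings are not stated in this section), and both the sequences and the function name are hypothetical.

```python
from itertools import combinations

def nucleotide_diversity(aligned_seqs):
    """Mean pairwise proportion of differing sites across aligned sequences."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(aligned_seqs, 2))
    total = 0.0
    for s1, s2 in pairs:
        # Compare only sites where both sequences carry an unambiguous base
        sites = [(x, y) for x, y in zip(s1, s2) if x in "ACGT" and y in "ACGT"]
        diffs = sum(1 for x, y in sites if x != y)
        total += diffs / len(sites)
    return total / len(pairs)

# Toy alignment (hypothetical, not real Pagellus data)
toy = ["ATGGCCTTAA", "ATGGCTTTAA", "ATGACCTTAG", "ATGGCCTTAA"]
pi = nucleotide_diversity(toy)
print(f"pi = {pi:.3f}")
```

Applied gene by gene to an alignment of the four Pagellus mitogenomes, the same calculation would give per-gene values comparable in kind to those plotted in Fig. 2A.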

Fig. 2

Mitogenomic characteristics of the protein-coding genes (PCGs): (A) Nucleotide diversity (π) across PCGs, illustrating genetic variability among the four Pagellus species. (B) Scatter plot of transition (s) and transversion (v) rates relative to genetic divergence in PCGs, calculated using the Tamura-Nei (TN93) distance across all Sparidae species. (C) Box plot showing the mean Ka/Ks ratios for each PCG, with values consistently below 1 across Sparidae, indicating purifying selection.

Substitution pattern and relative synonymous codon usage

Estimation of the ratio of nonsynonymous (Ka) to synonymous (Ks) substitutions in the PCGs indicated that all sparids, including P. bellottii, are subject to similar levels of selection pressure. The Ka/Ks analyses across all sparid species revealed that all PCGs exhibited ratios below 1, indicating purifying selection. The mean Ka/Ks values ranged from 0.011 ± 0.004 for COI, the lowest ratio, to 0.074 ± 0.018 for ND2, the highest. Overall, the Ka/Ks ratios across all sparids follow the order: COI < Cytb < COII < ND1 < COIII < ND5 < ND3 < ND4L < ND4 < ATP6 < ND6 < ATP8 < ND2 (Fig. 2C, Supplementary Table S3). Moreover, the Relative Synonymous Codon Usage (RSCU) analysis showed how codons are used to encode specific amino acids in the PCGs of the four Pagellus species. In total, the PCGs of P. bellottii encode 3,608 amino acids, excluding stop codons. The amino acid composition was dominated by the hydrophobic amino acids leucine (14.4%) and alanine (6.3%), followed by three amino acids of neutral hydropathy: proline (11.4%), serine (10.8%), and threonine (6.9%). Meanwhile, two hydrophobic amino acids, cysteine (1.5%) and tryptophan (2.9%), and three hydrophilic amino acids, aspartic acid (2.1%), glutamic acid (2.2%), and lysine (3.0%), were less abundant (Fig. 3A, Supplementary Table S4). Comparative analysis showed that hydrophobic amino acids are more dominant than the others. The two hydrophobic amino acids leucine and alanine, followed by the three neutral amino acids (proline, serine, and threonine), are commonly abundant in all four Pagellus species. Notably, leucine and serine are each encoded by six synonymous codons, contributing to the increased abundance of these amino acids across all four Pagellus species.
Moreover, several codons exhibited higher relative usage, indicated by values greater than 1.5, reflecting their preferential contribution to the translation of specific amino acids. For example, GCC for alanine, CTT for leucine, TCT for serine, and ACC for threonine showed elevated usage in all four Pagellus species (Fig. 3B, Supplementary Table S4).
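RSCU is conventionally defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. A value of 1 therefore means no bias, and a value above 1.5 marks a strongly preferred codon. The sketch below illustrates this definition for one amino acid; the codon counts are hypothetical and do not come from Supplementary Table S4.

```python
def rscu(codon_counts):
    """RSCU for the synonymous codons of a single amino acid.

    codon_counts maps each synonymous codon to its observed count.
    """
    n = len(codon_counts)
    total = sum(codon_counts.values())
    expected = total / n  # count per codon if usage were uniform
    return {codon: count / expected for codon, count in codon_counts.items()}

# Hypothetical counts for the four alanine codons (illustrative only)
ala = {"GCC": 120, "GCA": 80, "GCT": 60, "GCG": 20}
ala_rscu = rscu(ala)
preferred = [codon for codon, value in ala_rscu.items() if value > 1.5]
print(ala_rscu)
print("Preferred codons (RSCU > 1.5):", preferred)
```

By construction, the RSCU values for an amino acid sum to its number of synonymous codons, which provides a quick sanity check on any RSCU table.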

Fig. 3

Structural characteristics of amino acids in PCGs across four Pagellus species: (A) Comparative amino acid composition. (B) Relative synonymous codon usage (RSCU) analysis, highlighting codon preferences contributing to the translational efficiency of each amino acid.

Ribosomal RNA and transfer RNA structures

The mitogenome of P. bellottii includes two rRNA genes: the 12S rRNA (small subunit) and the 16S rRNA (large subunit), spanning a total of 2,650 bp, which accounts for 15.9% of the overall mitogenome length. Comparative analysis showed that the rRNA length in P. erythrinus is conserved, whereas P. bogaraveo and P. acarne possess slightly shorter rRNAs (2,648 bp). The rRNA genes showed a bias towards AT, ranging from 52.5% in P. acarne to 54.0% in P. bellottii. Additionally, AT-skewness values ranged from 0.191 to 0.212 and GC-skewness values ranged from −0.087 to −0.071 among all four Pagellus species (Table 2).

The 22 tRNA genes within the mitogenome of P. bellottii are scattered between rRNA and PCGs regions. It has a collective size of 1,556 bp, accounting for 9.3% of the total mitogenome length. All 22 tRNAs demonstrated a bias towards AT content of 54.5%, showing an AT-skewness and a GC-skewness value of 0.087 and − 0.102, respectively (Table 2). Furthermore, the secondary structure of the majority of tRNA molecules displayed the typical cloverleaf shape, with the exception of tRNA-Ser1, which failed to form this structure due to the absence of a nucleotide bond in its DHU arm. A total of 17 tRNA genes were found to be structured through a combination of standard Watson-Crick base pairing and Wobble base pairing (G-T, T-T, G-G, G-A), leading to an interesting blend of conventional and mismatched pairings in their overall configuration. In contrast, the remaining five tRNA genes rely exclusively on Watson-Crick base pairing (Supplementary Fig. S1). Comparative analysis showed that all four Pagellus species have a similar anticodon pattern in all 22 tRNA genes (Supplementary Table S5).

Structures of control region

The CR of P. bellottii spans 987 bp and is located between tRNA-Pro and tRNA-Phe, contributing 5.9% of the complete mitogenome. Comparative analyses showed that the CR length varies among the four Pagellus species, with P. bogaraveo having the longest (1,195 bp), followed by P. erythrinus (1,154 bp), P. bellottii (987 bp), and P. acarne (781 bp). Notably, the CRs of all four Pagellus species showed similar AT content, ranging from 62.5% in P. erythrinus to 63.9% in P. acarne, while AT skewness ranged from −0.006 in P. bellottii to 0.029 in P. bogaraveo (Table 2). Since the CR is AT-biased, the GC content is low, with GC-skewness ranging from −0.233 in P. bogaraveo to −0.135 in P. acarne.

The CR of the four Pagellus species also contains four conserved sequence blocks (CSBs): CSB-1, CSB-2, CSB-3, and CSB-D. Among these conserved domains, CSB-1 is the longest (21 bp), while the others range from 18 bp (CSB-2 and CSB-D) to 19 bp (CSB-3). Comparative analysis of CSB-D showed that P. bellottii, P. acarne, and P. bogaraveo share a similar conserved nucleotide pattern, while P. erythrinus differs from its congeners by a single base. In CSB-1, P. bellottii and P. bogaraveo exhibited conserved nucleotide sequences, whereas the other two species displayed nucleotide variations at three positions. In CSB-2 and CSB-3, P. bellottii and P. erythrinus showed a consistent pattern of conserved nucleotides, while P. bogaraveo had 1 bp and 3 bp differences in CSB-2 and CSB-3, respectively. Because of the shorter CR sequence in P. acarne, the comparison of CSB-2 and CSB-3 with the other three Pagellus species is limited. Moreover, the CRs of P. bellottii and P. acarne showed no repetitive sequences, while the CRs of the two other congeners exhibited long tandem repeats in the extended termination-associated sequence (ETAS) region, with P. bogaraveo having two copies of a 183 bp consensus sequence and P. erythrinus having two copies of a 160 bp consensus sequence (Fig. 4).

Fig. 4

Nucleotide features of the control regions (CRs) in the mitogenomes of four Pagellus species, emphasizing conserved sequence blocks. Highly conserved nucleotides are denoted by black asterisks. The graphical panel below depicts the presence or absence of tandem repeats, along with their sequence characteristics in different Pagellus species.

Matrilineal phylogeny and divergence time

Both Bayesian (BA) and Maximum Likelihood (ML) phylogenetic trees revealed a cohesive clustering of Sparidae family members, supported by high posterior probabilities and bootstrap values, in contrast to the outgroup taxa from Lethrinidae and Nemipteridae (Fig. 5, Supplementary Fig. S2). Within the Sparidae, species members of the Pagellinae, Sparinae, Denticinae, and Boopsinae subfamilies exhibited clear non-monophyletic clustering. Notably, two major clusters were identified among all sparid species, one cluster comprises species from the genera Dentex, Evynnis, Pagellus, Pagrus, Argyrops, Polysteganus, and the monotypic genus Parargyrops, while the other includes species from Calamus, Stenotomus, Archosargus, Boops, Rhabdosargus, Diplodus, Acanthopagrus, Pagellus, and the monotypic genera Lagodon and Sparus. Specifically, the phylogenetic trees revealed a non-monophyletic cladistics pattern in three genera: Pagrus, Dentex, and Pagellus. Within the genus Pagellus, P. bellottii exhibits the closest phylogenetic relationship with P. erythrinus, while P. bogaraveo and P. acarne are resolved as sister species based on the current mitogenomic-based cladistic analysis. Furthermore, divergence time estimation for sparid species indicated a split between the two major clusters at approximately 28.8 million years ago (MYA), likely occurring during the Oligocene epoch. The divergence between P. bellottii and P. erythrinus was estimated to have occurred around 6.1 MYA, during the Miocene epoch (Fig. 6). As this event is more recent than the divergence between P. bogaraveo and P. acarne, which occurred approximately 10.5 MYA in the Atlantic Ocean and Mediterranean Sea, P. bellottii and P. erythrinus can be considered the ‘younger sister’ species among the four Pagellus congeners.

Fig. 5

Bayesian phylogeny based on complete mitogenomes, depicting the non-monophyletic evolutionary relationships of the four Pagellus species within the Sparidae lineage. Posterior probability values are indicated at each node. Subfamilial classifications are represented by colored boxes adjacent to each clade. Distribution ranges of the four species are integrated with the phylogeny to visualize their biogeographic patterns across the Atlantic Ocean and Mediterranean Sea. All distribution maps were produced using ArcGIS version 10.6 based on IUCN data. The photograph of P. bellottii was taken by the second author (E.O.M.E.), while images of P. erythrinus, P. bogaraveo, and P. acarne were sourced from Wikimedia Commons (Creative Commons Attribution 4.0) and edited in Adobe Photoshop CS 8.0.

Fig. 6

(A) Maximum Likelihood-based timetree estimating the divergence times among Pagellus congeners, with divergence estimates denoted by red font at each node. Calibration points (orange diamonds) are based on previous fossil-derived data42 indicating the divergence between P. acarne and P. bogaraveo during the Messinian-Miocene epoch (lower 7–upper 14 Mya). (B–D) Paleogeographic maps depicting tectonically driven changes in the Atlantic–Mediterranean gateway from the middle Miocene to the present. The black arrows trace the inferred drift path of the Alboran Plate from the early-middle Miocene through the formation of the Gibraltar Arc. The source map was obtained from previously published literature48 and edited manually with prior permission granted by Walter Capella through personal communication. (E) Map showing the present-day sea surface temperature of the Atlantic Ocean, generated using ArcGIS version 10.6 with environmental layers sourced from the Bio-ORACLE v3.0 database (https://bio-oracle.org/) and manually refined in Adobe Photoshop CS 8.0.

Discussion

Despite variations in total length, the mitochondrial genomes of the four Pagellus species exhibit notable similarity, encompassing 37 genes and a control region. The organization and arrangement of these genes align with the typical mitochondrial architecture observed in other teleosts, including Spariformes25. The mitogenomic base composition of four Pagellus with AT-bias is also similar to vertebrates, including teleosts26. It is aligned with the hydrophobic character in mitochondrial proteins27. The results indicate the mitogenome of P. bellottii is conserved similarly to other Pagellus or sparids22,23,24,26. The overall organization of mitochondrial genomes, particularly gene order, plays a critical role in inferring gene rearrangement events, as comparisons with the putative ancestral gene order provide essential insights into organismal evolutionary trajectories28. Consequently, comprehending gene organization and base composition in mitogenome is pivotal for enhancing the accuracy of species classification. Additionally, placement of mitochondrial genes can impact various biological aspects in fish, including physiology, life history, molecular mechanisms, and other evolutionary processes29.

The start and stop codon pattern of Pagellus species is consistent with findings in other subfamilies within the Sparidae family26. In this study, amino acids with hydrophobic characteristics were more frequent than those with hydrophilic or neutral hydropathy properties, with leucine being the most frequent. The observed high abundance of hydrophobic amino acids is consistent with findings reported in other teleost species30. The hydropathy characteristics of amino acids fundamentally shape the evolution of mitochondrial proteins, which are vital for regulating tissue-level metabolism and enabling organisms to adapt to fluctuating environmental conditions, thereby underscoring their evolutionary significance31. Given the frequent environmental fluctuations in the Atlantic Ocean and their potential impact on the observed range expansion of P. bellottii, further investigations into the hydropathy characteristics and adaptive modifications of mitochondrial proteins are warranted. Such investigations could provide valuable insights into the role of these modifications in improving protein conformation and function under changing environmental stressors. Such mitogenomic information facilitates the inference of adaptive mechanisms and metabolic responses that underpin the resilience and tolerance of aquatic organisms, including Pagellus species, within the highly variable habitats of tropical and subtropical Atlantic and Mediterranean marine ecosystems32,33.

Moreover, the present findings demonstrate that the codon usage patterns of PCGs are conserved across all four Pagellus species, and that various codon combinations contribute to the translation of specific amino acids. Overall, the RSCU values deviated from 1, indicating varying degrees of codon usage bias among different codons in the Pagellus species. This codon usage bias is closely associated with gene expression levels; highly expressed genes tend to utilize specific codons more frequently than genes with lower expression efficiency34. Compared to low-expression genes, highly expressed genes exhibit a stronger codon usage bias and preferentially employ a subset of synonymous codons. The findings of this study suggest that PCGs in Pagellus species are conserved and likely maintain similar functional roles. In various organisms, codon usage bias arises from a combination of mutation pressure, natural selection, and genetic drift. Furthermore, codon bias is influenced by multiple factors, including tRNA abundance and interactions, recombination rates, mRNA secondary structure, codon position and context within genes, GC content, gene expression levels, gene length, and overall genomic composition35.

Furthermore, the calculation of the Ka/Ks ratio is a widely acknowledged method for estimating selective pressure in line with Darwinian theory, providing insights into selection pressure at the molecular level both within the same taxon and across closely related taxa35. This selection pressure is fundamental for understanding the evolutionary dynamics of genes subjected to selective forces and plays a crucial role in elucidating species divergence. Therefore, detailed analysis of the Ka/Ks substitution ratios in PCGs provides insight into the selection pressures acting on each gene: a ratio greater than 1 indicates positive selection, a ratio equal to 1 suggests neutral evolution, and a ratio less than 1 reflects purifying (negative) selection36. In this study, the Ka/Ks ratios in PCGs were less than 1, indicating strong purifying selection to preserve their functional integrity, with the majority of mutations occurring as synonymous substitutions across all species within the Sparidae family. Among the 13 PCGs, the ND2 gene exhibited the highest Ka/Ks ratio, followed by ATP8 and ND6, suggesting that ND2 has been under the least selective constraint and has evolved at a relatively faster rate. In contrast, COI displayed the lowest Ka/Ks ratio, followed by Cytb, indicating strong purifying selection and slower evolutionary rates. These findings highlight the role of natural selection in eliminating deleterious nucleotide mutations through negative selective pressure, consistent with patterns previously observed in teleost species37,38,39. The Ka/Ks analysis of Pagellus and related species provides a valuable framework for understanding the subtleties of natural selection. The strong purifying selection acting on PCGs enhances the evolutionary resilience of Pagellus species, contributing to genetic stability and facilitating speciation through ecological adaptation and colonization.
This process reflects the evolutionary trajectory and native distribution of Pagellus species, offering insights into the complex interplay between mutational events and selective pressures in shaping their evolution. Such interactions are fundamental to protein evolution, as demonstrated by the composition and relative abundance of PCGs25.
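The interpretation rule for Ka/Ks described above can be written as a small helper. The sketch below assumes a user-chosen tolerance around 1 for calling a ratio neutral; the tolerance value and the function name are illustrative choices, not part of the study. It is applied to the lowest and highest mean ratios reported in the Results (COI and ND2).

```python
def selection_regime(ka_ks, tol=0.05):
    """Classify a Ka/Ks ratio as purifying, neutral, or positive selection.

    tol is an illustrative tolerance around 1 (an assumption, not from the study).
    """
    if ka_ks < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if ka_ks < 1 - tol:
        return "purifying"
    if ka_ks > 1 + tol:
        return "positive"
    return "neutral"

# Mean Ka/Ks values reported in the Results for the two extreme genes
reported = {"COI": 0.011, "ND2": 0.074}
for gene, ratio in reported.items():
    print(gene, ratio, selection_regime(ratio))
```

Both extremes fall far below 1, so every PCG in the reported ordering is classified as under purifying selection, whatever reasonable tolerance is chosen.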

The complete mitochondrial genome of P. bellottii contains both the small and large rRNA subunits encoded on the heavy strand, consistent with the mitochondrial gene arrangement observed in other Pagellus species23,24. These rRNA genes, which are generally maintained as ribonucleoproteins, play a pivotal role in translating genetic information from mRNAs into amino acids and are therefore informative about the essential mechanisms of protein synthesis40. The tRNAs in P. bellottii predominantly exhibited the canonical secondary structure, except for tRNA-Ser1, which deviated due to missing or mismatched base pairs in the DHU arm. This missing base-pairing feature is commonly detected in vertebrates, including the order Spariformes22,23,24,25. Comparative analysis revealed that the organization and structure of tRNA genes are conserved among three of the Pagellus species, the exception being P. bogaraveo, which contains a 66 bp tRNA-Cys pseudogene within the WANCY region22. This distinct organization of tRNA genes and the presence of heteroplasmy in the WANCY region are critical for the expression and functional integrity of mitochondrial tRNA genes22.

The CR of P. bellottii exhibits an A-T bias, consistent with other Pagellus congeners and sparid species, and contains four conserved sequence blocks, as observed in sparid fishes like Pagrus major21,22,23,24. The investigation of these conserved domains is significant due to the presence of highly conserved nucleotides within the highly variable CR. Therefore, the nucleotide variations identified within these conserved domains serve as critical markers that can be employed for species-level differentiation40. The nucleotide composition of the CR in Pagellus species, including the detection of polymorphic sites within conserved blocks, offers valuable insights for the development of specific molecular markers for species identification and for elucidating population structure. Furthermore, elucidating the complex mechanisms governing the CR including random loss, dimer formation, non-random loss, duplication, and genomic rearrangements is crucial for comprehending the structural diversity of mitochondrial genes and the evolutionary dynamics25. The detection of tandem repeats within the ETAS region of P. bogaraveo and P. erythrinus further highlights notable diversity and distinct repeat patterns that likely contribute to the formation of stable hairpin loops, which are recognized as sequence-specific signals involved in regulating mitochondrial DNA replication termination41.

The present cladistic framework derived from mitogenome data corroborates earlier evolutionary hypotheses concerning the non-monophyletic nature of Sparidae21 as well as certain genera including Pagellus, as previously inferred from partial mitochondrial and nuclear markers42,43,44,45. These findings are also consistent with phylogenomic analyses based on six mitochondrial and 21 nuclear genes, which similarly reveal a non-monophyletic pattern among Pagellus species in the marine environment46. Our findings indicate that P. bellottii is closely related to P. erythrinus, while exhibiting a more distant phylogenetic relationship to the other two congeners. This observed phylogenetic affinity between P. bellottii and P. erythrinus may be further substantiated by the notable morphological similarities shared between these two species. Both P. bellottii and P. erythrinus exhibit a more laterally compressed body shape, whereas the other two species display a more oblong and fusiform morphology. Additionally, the scalation on the dorsal surface of the head terminates anterior to a transverse line through the middle of the eye in P. bellottii and P. erythrinus, in contrast to the posterior termination in P. bogaraveo and P. acarne. The mouth in P. bellottii and P. erythrinus is low-set and slightly oblique, while it is low-set and nearly horizontal in the other two species. The number of gill rakers on the lower and upper limbs of the first arch ranges from 8 to 10 and from 5 to 6, respectively, in P. bellottii and P. erythrinus, whereas higher counts are observed in the other two species. Lateral line scale counts further support this relationship, with P. bellottii exhibiting 54–60 and P. erythrinus 55–65 scales, in contrast to the significantly higher counts in P. bogaraveo (68–74) and P. acarne (65–72). Furthermore, the coloration of the oral cavity also differs: it is typically whitish to greyish in P. bellottii and P. erythrinus, while P. bogaraveo and P. acarne display a distinct orange-red hue13,14. Hence, the phylogenetic relationships and morphological characteristics together provide a more comprehensive understanding of the evolutionary affinities between P. bellottii and P. erythrinus, particularly in contrast to the more distantly related congeners P. bogaraveo and P. acarne.

Notably, divergence time estimates indicate an interval of approximately 4.4 million years between the initial divergence of P. acarne and P. bogaraveo and the subsequent split between P. bellottii and P. erythrinus. These findings suggest that evolutionary processes and reproductive isolation within the Pagellus lineage have occurred over an extended timescale in the marine environment, highlighting the long-term nature of sparid evolution as reported in previous studies42,47. Based on the estimated divergence times, it can be hypothesized that geological events in the Atlantic–Mediterranean gateway and the formation of the Gibraltar Arc may have driven the speciation and lineage diversification of Pagellus in both the Eastern Atlantic Ocean and the Mediterranean Sea. The divergence of Pagellus species can be traced back to the late Miocene, corresponding to the disconnection of the Atlantic Ocean and Mediterranean Sea, followed by the hypersaline event known as the Messinian salinity crisis48. Subsequently, the connection between the Atlantic and Mediterranean was re-established during the Pliocene (5.4 to 1.8 MYA), leading to a shared geographical distribution for major sparids42,47. Three Pagellus species are widely distributed across the Mediterranean Sea and the northeastern Atlantic Ocean, whereas P. bellottii has extended its native range into the southeastern Atlantic. This asymmetric distribution pattern of P. bellottii may be influenced by hydrographic and climatic conditions in the North and South Atlantic Oceans, which have gradually diverged due to the Coriolis effect driven by Earth’s rotation49,50. In the North Atlantic, distinct oceanic gyres form due to oceanic current circulation, with the warm Gulf Stream flowing northward and the cold Canary Current flowing southward, restricting the northward distribution of tropical marine fish species49,50. In addition, the cold Benguela and warm Agulhas currents, along with the warm South Equatorial currents off the southern coast of Africa, form a primary physical barrier that may restrict overlap between the distribution of the Atlantic Pagellus species and that of their congeners (P. natalensis and P. affinis) in the Western Indian Ocean51.

Furthermore, the recent northward expansion of P. bellottii into the northeastern Atlantic and southeastern Mediterranean Sea could be attributed to abiotic factors, such as increasing sea surface temperatures, which may be influencing its distribution. A similar pattern of northward range expansion has been observed in several tropical marine fish species, likely driven by the accelerated warming of marine environments over recent decades52. This suggests that warming of the cold-water current passing the Canary Islands may enable P. bellottii to expand its distribution range toward the Bay of Biscay52. Moreover, the range expansion of P. bellottii to Haifa Bay in the southeastern Mediterranean Sea suggests that this species might possess a high tolerance to salinity, as the Mediterranean Sea has exhibited higher salinity levels than the Atlantic Ocean since the end of the Messinian Salinity Crisis49,50.

Limitations and future direction

The present study provides the first complete mitochondrial genome of P. bellottii from the Eastern Atlantic Ocean and offers insights into its structural characteristics and phylogenetic placement within the major Sparidae clade; however, several limitations persist that warrant attention in future research. Notably, the current analysis is based on a single specimen collected from Ghana, which may not adequately capture the species’ genetic diversity or its phylogeographic structure, given its broader distribution from the Eastern Atlantic to the Bay of Biscay in the northeastern Atlantic and the south-eastern Mediterranean Sea. Future investigations should therefore incorporate extensive surveys and sampling across the full geographic range of the species, including multiple individuals from different populations. This approach would enable a more in-depth understanding of population structure, gene flow, and demographic history by incorporating both partial and complete mitochondrial genetic data. Moreover, this study incorporates mitogenomic data from only four of the six currently recognised Pagellus species worldwide. Hence, the addition of complete mitogenomes for P. affinis and P. natalensis, which inhabit the Western and Southwestern Indian Ocean, is essential for clarifying the comprehensive matrilineal evolutionary relationships and lineage diversification within this genus. Additionally, the application of whole-genome sequencing and genome-wide single nucleotide polymorphism (SNP) analyses would enable more precise inferences of genetic variation, population divergence, and adaptability of Pagellus species in diverse marine ecosystems. Ultimately, an integrative approach encompassing morphological and genetic analyses, biogeographic patterns, and geological history will yield deeper insights into the evolutionary trajectory and present-day population structure of this species. Such findings will significantly contribute to marine ichthyological systematics, conservation strategies, and aquaculture development.

Conclusion

In the present study, the complete mitochondrial genome of P. bellottii is reported for the first time. The comprehensive analyses offer an in-depth understanding of the genetic structure and variations of this species compared with its closely related congeners. The phylogenetic reconstruction further elucidates the non-monophyletic clustering of Pagellus and infers the evolutionary relationships within the broader Sparidae family. Considering the overlapping geographic distributions of four Pagellus species and the recently documented range expansion of P. bellottii into the Bay of Biscay (northeastern Atlantic) and Haifa Bay (southeastern Mediterranean), this study discusses potential lineage diversification driven by Miocene-era geological events, subsequently influenced by oceanic currents and ongoing climate change. Overall, the findings underscore the significance of integrating mitogenomic data and demographic history to understand matrilineal speciation and diversification processes, particularly in the context of the biogeographical complexity of the Atlantic Ocean and Mediterranean Sea.

Materials and methods

Sample handling and species identification

A single specimen of P. bellottii was collected from Tema, Ghana (5.611389 N, 0.044444 W) (Fig. 1). Muscle tissue from the area between the lateral line and the dorsal fin was carefully excised under sterile conditions, placed in 95% ethanol, and stored in 2 ml tubes. To prevent cross-contamination and DNA hydrolysis, the tissue sample was stored in a freezer at −20 °C. The specimen was later vouchered in 10% formaldehyde and deposited at the Fisheries Scientific Survey Division under Ghana’s Ministry of Fisheries and Aquaculture Development (MOFAD). The study design and all methods were conducted in compliance with the relevant guidelines and regulations of ARRIVE 2.0 (https://arriveguidelines.org), specifically for animal research53. Species identification was confirmed using several key morphological features3,4. All laboratory examinations were conducted in accordance with the relevant guidelines and regulations approved by the Pukyong National University Institutional Animal Care and Use Committee (PKNU-IACUC) under Approval No. PKNUIACUC-2025-16, dated February 18, 2025.

DNA extraction and COI marker sequencing

Genomic DNA was extracted using the AccuPrep® DNA Extraction Kit (Bioneer, Republic of Korea) according to the manufacturer’s standard instructions. The primer pair Fish-BCH (5′-TCAACYAATCAYAAAGATATYGGCAC-3′) and Fish-BCL (5′-ACTTCYGGGTGRCCRAARAATCA-3′) was used to amplify the targeted partial mitochondrial COI marker (~ 650 bp)54. Polymerase Chain Reaction (PCR) was conducted using a Takara thermal cycler with a 30 µL reaction mixture consisting of 1 µL each of forward and reverse primers, 0.9 µL of 3% dimethyl sulfoxide (DMSO), 19.9 µL of sterilized deionized water, 3 µL of 10× ExTaq Buffer, 0.2 µL of Ex Taq HS enzyme, 3 µL of dNTPs, and 1 µL of 1/10 diluted target DNA template. The thermal conditions included an initial denaturation at 94 °C for 3 min, followed by 40 cycles of denaturation at 94 °C for 30 s, annealing at 50 °C for 30 s, and extension at 72 °C for 1 min, and a final extension at 72 °C for 5 min. Gel electrophoresis with 1.5% agarose was used to visualize DNA integrity, followed by a DNA purity assessment using a NanoDrop spectrophotometer (Thermo Fisher Scientific, USA). The successfully amplified PCR product was purified using the AccuPrep® Purification Kit (Bioneer, Republic of Korea) to remove impurities and then Sanger-sequenced bidirectionally at Macrogen (Daejeon, Republic of Korea). The COI sequence was used for species confirmation based on sequence query and similarity using nucleotide BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi) and was submitted to the GenBank database (Accession number PQ242656).
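As a quick consistency check, the component volumes listed above can be summed against the stated 30 µL reaction volume. The following is a minimal sketch for illustration only; the dictionary labels and the function name `total_volume_ul` are our own and do not come from any kit protocol.

```python
# Component volumes (µL) as listed in the PCR recipe in the text.
# The keys are descriptive labels chosen for this sketch.
PCR_MIX_UL = {
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "3% DMSO": 0.9,
    "sterilized deionized water": 19.9,
    "10x ExTaq Buffer": 3.0,
    "Ex Taq HS enzyme": 0.2,
    "dNTPs": 3.0,
    "1/10 diluted DNA template": 1.0,
}


def total_volume_ul(mix):
    """Sum the component volumes, rounded to 0.1 µL to absorb float error."""
    return round(sum(mix.values()), 1)


if __name__ == "__main__":
    print(total_volume_ul(PCR_MIX_UL))  # 30.0
```

The same dictionary could be scaled by the number of reactions when preparing a master mix, although no master-mix step is described in the text.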

Mitogenome sequencing and annotation

High-throughput next-generation sequencing (NGS) on the NovaSeq platform was employed for complete mitogenome analysis. A paired-end library (2 × 150 bp) was generated using the TruSeq library preparation kit for Illumina® (Illumina, USA). About 100 ng of extracted DNA was fragmented using a Covaris ultrasonicator to generate double-stranded DNA with blunt ends and 5′ phosphorylation. After end repair and ‘A’ tailing, the fragments were ligated as stubs with TruSeq UD DNA Indexing adapters. To construct the final library, the resulting products were purified and PCR-enriched using a bead-based method. Library quantification and size distribution quality assessment were performed using KAPA Library Quantification Kits for Illumina and the Agilent Technologies 4200 TapeStation D1000 (Agilent Technologies, USA), with sequencing conducted by Macrogen.

The NGS reads were assembled in Geneious Prime by mapping to the published P. erythrinus mitogenome (GenBank Accession No. MG653592) using the default mapping algorithm. The PCGs were validated by adjusting overlapping regions with MEGA X55. Gene placements and boundaries were determined using MITOS Galaxy version 1.1.656 and the MitoFish MitoAnnotator webserver57. Moreover, the boundaries of each PCG, including the start and stop codons, were verified using the Open Reading Frame (ORF) Finder webtool in GenBank (https://www.ncbi.nlm.nih.gov/orffinder), with the standard genetic code for vertebrate mitochondrial analysis, in conjunction with the MITOS Galaxy webserver. The final annotated P. bellottii mitogenome was submitted to the GenBank database (Accession number PQ524309).
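The start and stop codon verification described above can be illustrated with a small script. This is a minimal sketch and not the procedure used by ORF Finder or MITOS: it assumes the start and stop codons of the vertebrate mitochondrial genetic code (NCBI translation table 2) and treats a trailing "T" or "TA" as an incomplete stop codon, a feature commonly reported in fish mitogenomes. The function name `check_pcg_boundaries` is hypothetical.

```python
# Start and stop codons of the vertebrate mitochondrial genetic code
# (NCBI translation table 2).
START_CODONS = {"ATT", "ATC", "ATA", "ATG", "GTG"}
STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}
INCOMPLETE_STOPS = {"T", "TA"}  # assumed to be completed after transcription


def check_pcg_boundaries(cds):
    """Report the start codon and stop signal of one coding sequence.

    If the length is a multiple of three, the last codon must be a
    complete stop codon; otherwise the one or two leftover bases are
    checked against the incomplete stop signals.
    """
    cds = cds.upper()
    start = cds[:3]
    remainder = len(cds) % 3
    if remainder == 0:
        stop = cds[-3:]
        stop_ok = stop in STOP_CODONS
    else:
        stop = cds[-remainder:]
        stop_ok = stop in INCOMPLETE_STOPS
    return {
        "start": start,
        "start_ok": start in START_CODONS,
        "stop": stop,
        "stop_ok": stop_ok,
    }


if __name__ == "__main__":
    # Toy sequences for illustration, not taken from the mitogenome.
    print(check_pcg_boundaries("ATGGCCTAA"))
    print(check_pcg_boundaries("GTGGCCT"))
```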

Characterization and comparative analyses

The mitogenome of P. bellottii was represented as a circular map using the MitoFish MitoAnnotator webserver. Comparative analyses were performed to assess mitogenomic structural variations in P. bellottii with three other congeners within the genus Pagellus: P. bogaraveo (GenBank Accession No. AB305023)22, P. acarne (GenBank Accession No. MG736083)23, and P. erythrinus (GenBank Accession No. MG653592)24. Base compositions of PCGs, tRNA, rRNA, and the CR were evaluated using MEGA X, and intergenic and overlapping regions between contiguous genes were calculated manually. The nucleotide skewness, specifically the AT and GC skews, was calculated using the following formulas: AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C), respectively58.
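The two skew formulas above translate directly into code. The sketch below is illustrative only; the function name `base_skews` is our own, ambiguous bases are simply ignored, and a skew is reported as 0.0 when its denominator is zero, which is a convention chosen for this example.

```python
def base_skews(seq):
    """Return (AT-skew, GC-skew) for a nucleotide sequence.

    AT-skew = (A - T) / (A + T)
    GC-skew = (G - C) / (G + C)
    """
    seq = seq.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_skew, gc_skew


if __name__ == "__main__":
    # Toy sequence: A=3, T=1, G=1, C=3.
    print(base_skews("AAATGCCC"))  # (0.5, -0.5)
```

A positive AT-skew indicates more A than T on the analysed strand, and a negative GC-skew indicates more C than G.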

Nucleotide diversity (π) across all Pagellus species was estimated using a sliding window method in DnaSP ver 6.059, with a 200 bp window size (the length of sequence analysed in each window) and a 25 bp step size (the distance by which the window is shifted at each step). These sizes have been reported to improve the accuracy of locating protein-coding regions in eukaryotes60. The DnaSP software was also employed to calculate the pairwise substitution patterns in PCGs among the four Pagellus species and other Sparidae family members, including the ratio of Ks and Ka. In addition, the transition and transversion codon saturation rates across the PCGs of all Sparidae species were assessed using the DAMBE6 software61. To analyze the RSCU and the distribution of amino acids within the PCGs, the dataset of all Pagellus species was aligned using MEGA X. The secondary structures of tRNAs were reconstructed using the tRNA prediction tools ARWEN 1.262 and ARAGORN63, which are integrated into the MITOS Galaxy web server. Furthermore, the conserved blocks within the CR were identified using CLUSTAL X64 in MEGA X, with reference to the published CR of P. major from a previous study40. As repeat sequences play a vital role in developing markers for population structure studies, we employed the Tandem Repeats Finder program to detect tandem repeat patterns within the CR65.
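The sliding window scheme described above (200 bp window, 25 bp step) can be sketched as follows. This is a simplified illustration and not the DnaSP implementation: π is taken here as the mean proportion of differing sites over all sequence pairs, gaps and ambiguous bases are treated as ordinary characters, and all function names are hypothetical.

```python
from itertools import combinations


def pairwise_diff_per_site(a, b):
    """Proportion of aligned sites that differ between two sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nucleotide_diversity(seqs):
    """Mean pairwise difference per site (π) for aligned sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_diff_per_site(a, b) for a, b in pairs) / len(pairs)


def sliding_pi(seqs, window=200, step=25):
    """Yield (0-based window midpoint, π) along an alignment.

    `seqs` must contain at least two aligned sequences of equal length.
    """
    length = len(seqs[0])
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        yield start + window // 2, nucleotide_diversity(chunk)


if __name__ == "__main__":
    # Toy alignment with a tiny window so the output is easy to follow.
    toy = ["AAAA", "AAAT", "AATT"]
    for midpoint, pi in sliding_pi(toy, window=2, step=1):
        print(midpoint, round(pi, 3))
```

With the default arguments, each window covers 200 aligned sites and consecutive windows overlap by 175 sites.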

Dataset construction and phylogenetic analyses

To reconstruct the matrilineal lineages of the four Pagellus species and other sparids, a dataset encompassing the complete mitogenomes of 32 species was assembled, including the newly sequenced mitogenome from this study (Supplementary Table S6). In addition, two mitogenomes, those of Lethrinus obsoletus (Accession No. AP009165)20 and Nemipterus japonicus (Accession No. KJ473717)66 from the families Lethrinidae and Nemipteridae, respectively, were designated as outgroups. To investigate matrilineal phylogenetic relationships within Pagellus and other Sparidae species, a concatenated dataset of 13 PCGs was generated using the Concatenator Tool in iTaxoTools ver 0.167. The ‘GTR + G + I’ model was determined to be the most suitable based on the lowest Bayesian Information Criterion (BIC) value, as assessed using both PartitionFinder ver 268 and the JModelTest program69. Then, phylogenetic trees were constructed by Bayesian analysis employing a Metropolis-coupled Markov Chain Monte Carlo (MCMC) algorithm in MrBayes 3.1.270. The ML-based phylogenetic tree was constructed with standard settings on the PhyML 3.0 online server71. The two phylogenetic trees were then saved in Newick format (*.nwk) and visualized using Interactive Tree of Life (iTOL)72.
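The BIC-based model choice mentioned above rests on the standard formula BIC = k ln(n) − 2 ln(L), where k is the number of free parameters, n the number of alignment sites, and L the maximized likelihood. The sketch below only illustrates the selection rule; the model labels, log-likelihoods, parameter counts, and site count are invented placeholders and are not results from this study or output from PartitionFinder or JModelTest.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def best_model(candidates, n_sites):
    """Return the candidate name with the lowest BIC.

    `candidates` maps a model name to (log-likelihood, parameter count).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_sites))


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    toy_candidates = {
        "simple": (-5200.0, 1),
        "intermediate": (-5050.0, 6),
        "rich": (-5000.0, 11),
    }
    print(best_model(toy_candidates, n_sites=1000))  # rich
```

The k ln(n) term penalizes parameter-rich models, so a more complex model is preferred only when its gain in likelihood outweighs that penalty.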

Divergence time estimation

Divergence times among sparid lineages, including the four Pagellus species, were estimated using the RelTime73 method with the ML topology as the baseline tree in MEGA 1274. The RelTime method is widely recognized for its efficiency and accuracy in divergence time estimation, making it particularly well-suited for analysing large datasets with reduced computational time75. A known fossil calibration was applied to constrain the split between P. acarne and P. bogaraveo, with the fossil of Pagellus leptosomus from the Messinian diatomites of the Chelif Basin, Algeria (about 7 MYA), taken as the lower bound on the splitting age of the two species. Additionally, fossils of Boops sp. from the Messinian diatomites and middle Miocene (about 14 MYA) were used as the upper bound, as described in a previous study42. The generated timetree was further refined and visually enhanced using MEGA 12 to improve clarity and presentation quality.

Geospatial data acquisition and processing

The generation of the maps entailed a multi-step process involving the acquisition and processing of both vector and raster datasets. Initially, global administrative boundary shapefiles were downloaded from the DIVA-GIS platform (https://diva-gis.org/data.html), which offers high-resolution vector data suitable for spatial analysis. These shapefiles were subsequently imported into ArcMap within the ArcGIS v10.6 environment for further geospatial processing and overlay analyses. Global-scale ocean current data were sourced from the NOAA National Weather Service and the U.S. Army (https://data.amerigeoss.org/dataset/major-ocean-currents-arrowpolys-100m-76; accessed on 15 May 2025). Additionally, bathymetric and sea surface temperature layers from the CMIP6 Earth System Models were obtained via the Bio-ORACLE v3.0 database (https://bio-oracle.org/)76. All spatial datasets were reprojected to the WGS 84 geographic coordinate system to ensure interoperability and spatial consistency. The processed datasets were then integrated and exported for the creation of the final thematic maps. In addition, paleogeographic maps illustrating tectonically driven changes in the Atlantic–Mediterranean gateway and the evolution of the Gibraltar Arc from the middle Miocene to the present were manually adapted from previously published literature48.