Introduction

The phylum Chordata comprises three major subphyla: Vertebrata (vertebrates), Cephalochordata (cephalochordates), and Urochordata (tunicates). Among them, Cephalochordata is an early-diverging chordate lineage represented today solely by amphioxus, a small, marine, benthic organism whose morphology has remained conserved over hundreds of millions of years. Cephalochordates were long regarded as the closest living relatives of vertebrates, although phylogenomic analyses now generally place tunicates in that position. They share fundamental chordate features with vertebrates, such as a notochord, dorsal nerve cord, pharyngeal slits, and segmented muscle blocks (myomeres), but lack complex organs such as a true brain, heart, and paired appendages. Due to its early-diverging phylogenetic position and conserved body plan, amphioxus (commonly referred to as the lancelet) is often termed a “living fossil” and serves as a key model organism for studying the evolutionary transition from invertebrates to vertebrates1,2. Amphioxus therefore offers essential insights into the ancestral state of chordates, and its value as a model organism is well recognized in comparative genomics, evolutionary developmental biology (Evo-Devo), and molecular phylogenetics. Sequencing of the amphioxus genome, particularly that of Branchiostoma floridae, has revealed an exceptionally conserved genomic architecture that is thought to resemble that of early chordates3,4. Unlike vertebrates, amphioxus has not undergone whole-genome duplications (WGDs), which are considered a key factor in vertebrate complexity. The lack of WGD makes the amphioxus genome an ideal reference for reconstructing ancestral chordate gene structures and understanding the genomic innovations that accompanied the origin of vertebrates5,6.

Genomic studies have identified numerous gene families and regulatory networks that are conserved between amphioxus and vertebrates, including genes involved in neural patterning, somite formation, and endocrine signaling2,7. These studies suggest that the fundamental genetic toolkit required for vertebrate body plan development was already present in the common ancestor of cephalochordates and vertebrates. Moreover, transcriptomic and epigenomic data have further highlighted the deep conservation of gene expression patterns and chromatin organization, emphasizing the utility of amphioxus in comparative studies of gene regulation. Morphologically, amphioxus is characterized by a slender, laterally compressed, and semi-transparent body typically ranging from 2 to 8 cm in length. It retains several chordate features throughout its life, including a dorsal nerve cord, a persistent notochord that extends from head to tail, and a series of pharyngeal slits used in filter feeding8,9. The organism lacks highly differentiated organ systems found in vertebrates. Instead of a centralized brain, amphioxus possesses a simple cerebral vesicle, and its circulatory system is devoid of a heart, relying on contractile vessels to propel blood10. Reproduction is generally sexual with external fertilization, and developmental processes are well-characterized, showing remarkable similarities to vertebrate embryogenesis during early stages11,12. Ecologically, amphioxus inhabits shallow coastal waters, typically in sandy or muddy sediments at depths ranging from 5 to 50 m. It exhibits burrowing behavior, usually embedding itself vertically in the substrate with only its anterior end protruding to facilitate filter feeding13,14. Amphioxus is primarily nocturnal, remaining buried during the day to avoid predation and becoming more active at night. It prefers fully marine conditions with salinities ranging between 30 and 35 PSU and is commonly found in oxygen-rich, transparent waters. 
Its diet mainly consists of diatoms and protozoans such as Coscinodiscus, Navicula, Cyclotella, Chaetoceros, and Nitzschia. Buccal cirri surrounding the mouth help trap microscopic food particles, which are then directed through pharyngeal slits into the digestive tract while excess water is expelled15,16. Taxonomically, the classification of amphioxus has long been a subject of debate, particularly due to the intraspecific variability in morphological traits such as body size, pigmentation, and myomere count. Currently, the subphylum Cephalochordata includes three recognized genera: Branchiostoma, Asymmetron, and Epigonichthys. Of these, Branchiostoma is the most species-rich, with over 20 described species9,17. The genera Asymmetron and Epigonichthys are comparatively less diverse but exhibit distinctive morphological and reproductive traits. Geographically, Branchiostoma species are broadly distributed across the Indo-West Pacific, the Mediterranean, and the Atlantic Ocean, while Asymmetron and Epigonichthys are generally confined to tropical and subtropical regions along the equator18.

The advent of molecular phylogenetics has significantly improved our understanding of amphioxus taxonomy and evolutionary history. Analyses based on mitochondrial DNA (e.g., COI, 16 S rRNA), nuclear genes, and complete mitogenomes have revealed substantial cryptic diversity within amphioxus populations. These approaches have facilitated the resolution of previously ambiguous species boundaries and provided new insights into biogeographic patterns and evolutionary trajectories19,20. Notably, comparative mitochondrial studies have identified several genetically distinct lineages within what were previously considered single species, indicating the need for taxonomic revisions17,18. In the West Pacific, for instance, Branchiostoma belcheri has historically been regarded as a widespread species. However, detailed molecular studies have revealed that B. belcheri comprises at least two genetically distinct lineages. Zhong, et al.21 sequenced complete mitochondrial genomes from several B. belcheri populations and found that individuals from Qingdao and Japan, along with B. japonicum from Xiamen, clustered within the same clade, while populations from Maoming and other parts of Xiamen formed a separate lineage. These findings support the notion that B. belcheri and B. japonicum are distinct species, thus refining our understanding of amphioxus diversity and distribution in the region. Amphioxus populations have been recorded extensively across China’s coastal regions. In southeastern China, B. belcheri was first documented in Xiamen Bay nearly a century ago and was once considered a local fishery resource due to its abundance22. In northern China, B. belcheri tsingtauense was described in Qingdao and later reclassified as B. japonicum based on genetic evidence22,23. Amphioxus species have also been recorded in Japan24, Taiwan25, and Australia26, indicating a wide Indo-West Pacific distribution.

In the Pearl River Estuary and surrounding waters, five amphioxus species have been identified to date: three from the genus Branchiostoma (B. belcheri, B. japonicum, and B. malayanum) and two from Epigonichthys (E. cultellus and E. lucayanus). Among them, the three Branchiostoma species are found in relatively high abundance, while the Epigonichthys species are rare and recorded only sporadically27. Remarkably, B. malayanum, previously thought to be restricted to Southeast Asia (e.g., Thailand, Singapore, and the Solomon Islands)9, was recorded for the first time in the South China Sea in 2007, suggesting a broader distribution than previously assumed28. Despite significant advances in amphioxus taxonomy and phylogenetics, many questions remain unresolved. The evolutionary relationships among the three extant genera and within the family Branchiostomatidae are still being refined. In particular, the placement of B. malayanum, a dwarf species that differs morphologically from other congeners, has not been fully elucidated. To address this knowledge gap, comprehensive comparative analyses using complete mitochondrial genomes, combined with morphological data and geographic distribution, are essential.

In this study, we sequenced, assembled, and annotated the complete mitochondrial genome of B. malayanum for the first time. We also compared morphological traits among Branchiostoma populations from the Pearl River Estuary and conducted phylogenetic analyses using complete mitochondrial genomes from 10 of the 30 recognized species within the Branchiostomatidae family. Our specific objectives were to (1) determine the phylogenetic position of B. malayanum and (2) investigate evolutionary relationships across different subfamilies of Branchiostomatidae. This research provides new insights into amphioxus diversity and contributes to a deeper understanding of cephalochordate evolution and biogeography.

Results

Morphological traits

The morphological characteristics of B. malayanum under light microscopy are shown in Fig. 1. B. malayanum exhibited relatively high individual variation in total length (TL), body height (BH), the lengths of the segments anterior to the ventral pore (LSAVP), from the ventral pore to the anus (LVPA), and posterior to the anus (LPA), and the lengths of the upper and lower lobes of the caudal fin (LUCF and LLCF), with coefficients of variation (CV) ranging from 24.7 to 63.3% (Fig. 2). Among these traits, the length of the segment posterior to the anus (LPA) showed the highest degree of variation, with a CV of 63.3% (Fig. 2). In contrast, the myotome counts anterior to the ventral pore (NMAVP), between the ventral pore and the anus (NMVPA), and posterior to the anus (NMPA), as well as the total number of myotomes (TNM) and the numbers of dorsal and preanal fin chambers (NFCD, NFCP), exhibited lower variation, with CVs ranging from 13 to 20% (Fig. 2).

Fig. 1

Morphological characteristics of Branchiostoma malayanum under light microscopy. (a) Lateral view of a whole adult specimen, showing the slender, laterally compressed body and pointed rostrum. (b) Close-up of the anterior end, highlighting the buccal cirri and oral hood. (c) Magnified view of the midbody region, illustrating the segmented myomeres arranged in a V-shaped pattern. Scale bars: 4 mm (a), 100 μm (b), 500 μm (c).

Fig. 2

Intraspecific coefficients of variation for the morphological traits of Branchiostoma malayanum. Abbreviations: NMAVP, NMVPA, and NMPA, number of myotomes anterior to the ventral pore, between the ventral pore and the anus, and posterior to the anus; TNM, total number of myotomes; NFCD and NFCP, number of fin chambers in the dorsal and preanal regions; TL, total length; BH, body height; CFH, caudal fin height; RFL, rostral fin length; RFH, rostral fin height; LSAVP, LVPA, and LPA, lengths of the segments anterior to the ventral pore, from the ventral pore to the anus, and posterior to the anus; LUCF and LLCF, lengths of the upper and lower lobes of the caudal fin.
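The coefficient of variation reported for each trait is the standard deviation expressed as a percentage of the mean. The sketch below illustrates the calculation. It assumes the sample standard deviation (n − 1 denominator), which the text does not specify, and the measurements are made-up values, not data from this study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: 100 * sample standard deviation / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


# Hypothetical measurements for illustration only (not from the study).
lpa_mm = [1.0, 2.0, 3.0, 4.0, 5.0]   # a highly variable length trait
tnm = [58, 60, 62, 60, 60]           # a tightly constrained count trait

cv_lpa = coefficient_of_variation(lpa_mm)
cv_tnm = coefficient_of_variation(tnm)
print(f"CV(LPA) = {cv_lpa:.1f}%, CV(TNM) = {cv_tnm:.1f}%")
```

Because the CV is dimensionless, it allows length traits and count traits to be compared on the same scale.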

Principal component analysis (PCA) of the morphological traits of B. malayanum indicated that eight traits contributed most to the first principal component: total length (TL), body height (BH), the lengths of the segments anterior to the ventral pore (LSAVP), from the ventral pore to the anus (LVPA), and posterior to the anus (LPA), the lengths of the upper and lower lobes of the caudal fin (LUCF and LLCF), and caudal fin height (CFH). In contrast, the myotome counts anterior to the ventral pore (NMAVP) and between the ventral pore and the anus (NMVPA), the total number of myotomes (TNM), and the numbers of dorsal and preanal fin chambers (NFCD, NFCP) contributed most to the second principal component (Fig. 3). The allometric growth indices relating total length (TL) to LSAVP, LVPA, LUCF, LLCF, and BH were all close to 1 (Fig. 4), indicating that these traits grew isometrically as total length increased. By contrast, the allometric growth index relating TL to the total number of myotomes (TNM) was significantly less than 1 (Fig. 4), indicating an allometric relationship: as total length increased, the number of myotomes increased relatively little.

Fig. 3

Biplot of the principal component analysis (PCA) of the morphological traits of Branchiostoma malayanum. Abbreviations: NMAVP, NMVPA, and NMPA, number of myotomes anterior to the ventral pore, between the ventral pore and the anus, and posterior to the anus; TNM, total number of myotomes; NFCD and NFCP, number of fin chambers in the dorsal and preanal regions; TL, total length; BH, body height; CFH, caudal fin height; RFL, rostral fin length; RFH, rostral fin height; LSAVP, LVPA, and LPA, lengths of the segments anterior to the ventral pore, from the ventral pore to the anus, and posterior to the anus; LUCF and LLCF, lengths of the upper and lower lobes of the caudal fin.

Fig. 4

Allometric growth relationships between body length and other morphological traits of Branchiostoma malayanum. Abbreviations: NMAVP, NMVPA, and NMPA, number of myotomes anterior to the ventral pore, between the ventral pore and the anus, and posterior to the anus; TNM, total number of myotomes; NFCD and NFCP, number of fin chambers in the dorsal and preanal regions; TL, total length; BH, body height; CFH, caudal fin height; RFL, rostral fin length; RFH, rostral fin height; LSAVP, LVPA, and LPA, lengths of the segments anterior to the ventral pore, from the ventral pore to the anus, and posterior to the anus; LUCF and LLCF, lengths of the upper and lower lobes of the caudal fin.
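An allometric growth index is commonly estimated as the exponent b in the power law y = a·x^b, obtained as the slope of a regression of log(y) on log(x). The sketch below assumes ordinary least squares on natural logarithms, which may differ from the fitting procedure actually used in this study. The function name and the data are hypothetical and purely illustrative.

```python
import math


def allometric_index(x, y):
    """Fit y = a * x**b by least squares on log-transformed data; return (a, b)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx                 # slope on the log-log scale = allometric index
    a = math.exp(my - b * mx)     # intercept transformed back to the original scale
    return a, b


# Synthetic data for illustration only (not from the study).
tl = [10.0, 15.0, 20.0, 25.0, 30.0]      # total length
bh = [0.1 * t for t in tl]               # scales isometrically with TL (b = 1)
tnm = [40.0 * t ** 0.1 for t in tl]      # grows much more slowly than TL (b = 0.1)

a_bh, b_bh = allometric_index(tl, bh)
a_tnm, b_tnm = allometric_index(tl, tnm)
print(f"b(BH) = {b_bh:.2f}, b(TNM) = {b_tnm:.2f}")
```

An index near 1 corresponds to isometric growth, whereas an index well below 1 means the trait changes little as total length increases.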

Mitogenome structure

The mitochondrial genome of B. malayanum is a closed circular DNA molecule with a total length of 15,154 bp (GenBank accession number: PV546308). The overall nucleotide composition is biased towards adenine and thymine, with proportions of A (26.32%), T (37.15%), G (22.59%), and C (13.94%), giving an A + T content of 63.47%. The gene arrangement and composition are highly conserved, consistent with the typical organization observed in vertebrate mitochondrial genomes. The genome comprises 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, and 2 ribosomal RNA (rRNA) genes (12S rRNA and 16S rRNA). The PCGs include cytochrome b (Cytb), two subunits of ATP synthase (ATP6 and ATP8), three subunits of cytochrome c oxidase (COX1, COX2, and COX3), and seven subunits of NADH dehydrogenase (ND1, ND2, ND3, ND4, ND4L, ND5, and ND6). The majority of mitochondrial genes are encoded on the heavy strand (H-strand), including 15 tRNA genes (tRNA-Pro, tRNA-Phe, tRNA-Val, tRNA-Leu, tRNA-Ile, tRNA-Met, tRNA-Trp, tRNA-Ser, tRNA-Asp, tRNA-Lys, tRNA-Arg, tRNA-His, tRNA-Ser, tRNA-Leu, and tRNA-Gly), 12 PCGs (ND1, ND2, ND3, ND4, ND4L, ND5, COX1, COX2, COX3, ATP6, ATP8, and Cytb), and both rRNA genes (12S rRNA and 16S rRNA). The remaining genes, ND6 and seven tRNA genes, are transcribed from the light strand (L-strand) (Fig. 5). The 16S rRNA and 12S rRNA genes are 1,384 bp and 845 bp in length, respectively. Almost all 22 tRNA genes exhibit the canonical cloverleaf secondary structure, with individual gene lengths ranging from 59 to 71 bp. The mitochondrial genome of B. malayanum exhibits a negative AT skew of −0.1706 and a positive GC skew of 0.2368, indicating more thymine than adenine and more guanine than cytosine.
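The compositional statistics above can be checked directly from the reported base proportions. The sketch below assumes the usual definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C). The function name is illustrative and not part of any tool used in the study.

```python
def base_skews(a, t, g, c):
    """Return (A+T content, AT skew, GC skew) from base percentages or counts."""
    at_content = a + t
    at_skew = (a - t) / (a + t)
    gc_skew = (g - c) / (g + c)
    return at_content, at_skew, gc_skew


# Base proportions (%) reported in the text for the B. malayanum mitogenome.
at_content, at_skew, gc_skew = base_skews(a=26.32, t=37.15, g=22.59, c=13.94)
print(f"A+T = {at_content:.2f}%  AT skew = {at_skew:.4f}  GC skew = {gc_skew:.4f}")
```

With the reported proportions, this reproduces the stated A + T content (63.47%) and the stated skew values (−0.1706 and 0.2368) after rounding.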

Fig. 5

Circular map of the complete mitochondrial genome of Branchiostoma malayanum. The figure displays the annotated mitochondrial genome structure, including 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, and 2 ribosomal RNA (rRNA) genes. Gene names are indicated along the outer circle, with genes encoded on the majority strand shown outside the circle and those on the minority strand inside. Colors represent different gene types: protein-coding genes (blue), tRNA genes (lavender), and rRNA genes (virescent). The inner black circle represents the distribution of GC content across the mitochondrial genome, with outward projections indicating regions of above-average GC content and inward projections reflecting below-average levels. GC skew is illustrated using a color-coded system: green denotes negative GC skew, while deep purple indicates positive GC skew. The map was generated using the CGView Server.

The protein-coding genes (PCGs) of the B. malayanum mitochondrial genome span a total of 11,291 base pairs, accounting for 74.02% of the entire mitogenome. This region exhibits a marked AT bias, with an A + T content of 62.89%, and encodes a total of 3,752 amino acid residues. Among the 13 PCGs, ND6 is the only gene encoded on the light strand (L-strand), while the remaining genes are located on the heavy strand (H-strand). The longest PCG is ND5, comprising 1,797 bp and encoding 598 amino acids, whereas the shortest is ATP8, with a length of 165 bp, encoding 54 amino acids. Initiation codons are predominantly ATG across the genome; however, two genes (COX1 and ND1) initiate with the alternative start codon GTG. All PCGs terminate with complete stop codons: eight genes (Cytb, COX2, ATP6, COX3, ND3, ND4L, ND5, and ND6) end with TAA, and five genes (ND1, ND2, COX1, ATP8, and ND4) terminate with TAG (Table 1). The mitogenome encoded 22 tRNA genes ranging in length from 63 to 71 base pairs, collectively responsible for transporting 19 different amino acids. These tRNA genes were transcribed from both the heavy (H) and light (L) strands: seven tRNAs—tRNA-Thr, tRNA-Gln, tRNA-Asn, tRNA-Ala, tRNA-Cys, tRNA-Tyr, and tRNA-Glu—were encoded on the L-strand, while the remaining tRNAs were located on the H-strand. With the exception of tRNA-Ser, which lacks the dihydrouridine (DHU) stem, all other 21 tRNAs were predicted to fold into the typical cloverleaf secondary structure (Fig. 6). To investigate codon usage patterns and amino acid distribution, the amino acid composition and relative synonymous codon usage (RSCU) of the 13 protein-coding genes were analyzed in the mitochondrial genome of B. malayanum. This analysis offers insights into codon bias and the potential functional significance of amino acid composition in this species. 
The RSCU results indicated a preferential use of codons encoding leucine (Leu), isoleucine (Ile), phenylalanine (Phe), methionine (Met), valine (Val), and glycine (Gly), whereas codons corresponding to cysteine (Cys) were among the least frequently used (Fig. 7).
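RSCU compares the observed count of each codon with the count expected if all synonymous codons for that amino acid were used equally. The sketch below shows the calculation under that standard definition. The codon counts are invented for illustration, and the two synonymous families are passed in explicitly rather than taken from a full genetic code table, so the example makes no claim about which translation table the study used.

```python
def rscu(codon_counts, synonymous_families):
    """Return {codon: RSCU}, where RSCU = observed count / mean count in its family."""
    result = {}
    for amino_acid, codons in synonymous_families.items():
        total = sum(codon_counts.get(codon, 0) for codon in codons)
        if total == 0:
            continue  # amino acid absent from the data; RSCU is undefined
        expected = total / len(codons)  # count under equal usage of synonyms
        for codon in codons:
            result[codon] = codon_counts.get(codon, 0) / expected
    return result


# Toy input for illustration only (not counts from the B. malayanum mitogenome).
families = {"Phe": ["TTT", "TTC"], "Cys": ["TGT", "TGC"]}
counts = {"TTT": 30, "TTC": 10, "TGT": 3, "TGC": 1}

values = rscu(counts, families)
for codon, value in values.items():
    print(codon, round(value, 2))
```

Values above 1 indicate codons used more often than expected within their family. RSCU is normalized within each amino acid, so a rarely used amino acid such as Cys can still contain a preferred codon.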

Table 1 Characteristic constituents of the mitochondrial genome of Branchiostoma malayanum.
Fig. 6

Secondary structures of transfer RNAs in the mitochondrial genome of Branchiostoma malayanum. tRNA secondary structures were predicted and visualized using tRNAscan-SE version 1.21 and manually edited in Adobe Illustrator for clarity.

Fig. 7

Relative synonymous codon usage (RSCU) patterns in the mitochondrial genome of Branchiostoma malayanum.

Mitochondrial genetic divergence among amphioxus genera

Genetic divergence among the three amphioxus genera varied with the mitochondrial marker used (Fig. 8 and Table S2). For the COX1 gene, intra-generic divergence ranged from 0.193 ± 0.007 in Branchiostoma to 0.207 ± 0.008 in Asymmetron, while inter-generic comparisons showed the highest divergence between Asymmetron and Epigonichthys (0.271 ± 0.010) and the lowest between Branchiostoma and Epigonichthys (0.230 ± 0.007). Divergence was higher for the cytochrome b gene, with intra-generic values reaching 0.293 ± 0.018 in Epigonichthys. Inter-generic distances for this gene were highest between Asymmetron and Branchiostoma (0.359 ± 0.014) and lowest between Asymmetron and Epigonichthys (0.345 ± 0.014), indicating greater variability at this locus than at COX1. The ribosomal markers, 12S rRNA and 16S rRNA, showed a similar trend of high inter-generic divergence. The 12S rRNA gene showed the highest divergence between Asymmetron and Epigonichthys (0.386 ± 0.021) and the lowest intra-generic divergence in Asymmetron (0.215 ± 0.015). The 16S rRNA gene also revealed pronounced inter-generic divergence, with values ranging from 0.324 ± 0.014 (Branchiostoma vs. Asymmetron) to 0.365 ± 0.016 (Epigonichthys vs. Asymmetron). Overall, the results indicate consistently high inter-generic divergence across all four mitochondrial markers, supporting clear genetic differentiation among the three amphioxus genera. The higher divergence values in ribosomal genes compared to protein-coding genes also suggest differing evolutionary rates and potential utility in resolving deeper phylogenetic relationships.

Fig. 8

Heatmaps of pairwise genetic distances among three extant amphioxus genera (Asymmetron, Branchiostoma, and Epigonichthys) based on mitochondrial markers. Genetic divergence was estimated using the Kimura two-parameter (K2P) model for both intra- and inter-generic comparisons. Four mitochondrial loci were analyzed: two protein-coding genes (COX1 and Cytb) and two ribosomal RNA genes (12S rRNA and 16S rRNA). Each heatmap displays median pairwise genetic distances, with color gradients reflecting increasing divergence from low (light yellow) to high (dark blue). Diagonal values represent intra-generic distances, while off-diagonal values represent inter-generic divergence. Standard errors of the distance estimates were calculated from 10,000 nonparametric bootstrap replicates but are not displayed due to their small magnitude (< 0.01).
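The Kimura two-parameter distance between two aligned sequences depends only on the proportions of transitional (P) and transversional (Q) differences: d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)]. The sketch below is a minimal implementation of that formula, assuming pre-aligned sequences and skipping any site with a gap or ambiguous base. It is not the software used in this study, and the example sequences are invented.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Invented 10-bp alignments for illustration only.
d_transition = k2p_distance("ACGTACGTAC", "GCGTACGTAC")    # one A->G transition
d_transversion = k2p_distance("ACGTACGTAC", "CCGTACGTAC")  # one A->C transversion
print(round(d_transition, 4), round(d_transversion, 4))
```

Standard errors like those reported with the distances could then be approximated by resampling alignment columns with replacement and recomputing the distance for each replicate.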

Phylogenetic relationships within Branchiostomatidae based on mitochondrial genomes

Phylogenetic analyses based on complete mitochondrial genome sequences yielded a well-resolved topology that consistently supported the monophyly of the three recognized genera within the family Branchiostomatidae: Branchiostoma, Epigonichthys, and Asymmetron (Fig. 9). Both Maximum Likelihood (ML) and Bayesian Inference (BI) methods recovered congruent tree topologies, with strong statistical support across major nodes. Ciona intestinalis was used as the outgroup to root the tree. The genus Branchiostoma formed a distinct and well-supported clade (ML bootstrap support/posterior probability: 100/0.99), comprising B. belcheri, B. japonicum, B. malayanum, B. floridae, and B. lanceolatum. Within this clade, B. belcheri and B. japonicum were resolved as sister taxa with strong support (100/0.91), and B. malayanum clustered closely with this pair, forming a subclade that was also well supported (98/0.85). B. floridae and B. lanceolatum grouped together with moderate-to-high support (85/0.96), forming an early-diverging lineage within Branchiostoma. The genus Epigonichthys was recovered as monophyletic (ML/BI: 96/0.85), comprising E. cultellus and E. maldivensis, which formed a strongly supported sister-group relationship. Similarly, the Asymmetron species, A. inferum, A. lucayanum, and an undescribed Asymmetron sp., formed a well-supported clade (99/1), with A. lucayanum and Asymmetron sp. exhibiting closer affinity (96/0.92) and A. inferum occupying an early-branching position (89/0.90). The intergeneric relationships revealed that Asymmetron diverged earliest among the three genera, followed by Epigonichthys, with Branchiostoma forming the most derived clade. This topology is consistent with previous findings suggesting the early-diverging position of Asymmetron within the family Branchiostomatidae.
Overall, the phylogenetic reconstruction demonstrates clear genetic delineation among amphioxus genera and provides additional support for the distinct evolutionary trajectories of major lineages within cephalochordates.

Fig. 9

Maximum likelihood (ML) and Bayesian inference (BI) phylogenetic tree illustrating evolutionary relationships among amphioxus species based on complete mitochondrial genome sequences. The tree is rooted with Ciona intestinalis as the outgroup. Support values at each node are presented as ML bootstrap proportions (left) and BI posterior probabilities (right). Three well-supported clades corresponding to the genera Branchiostoma, Epigonichthys, and Asymmetron are recovered, each forming monophyletic lineages. Notably, Branchiostoma malayanum clusters closely with B. japonicum and B. belcheri, while Asymmetron lucayanum, A. inferum, and Asymmetron sp. form a strongly supported subgroup. The deeper branching of Asymmetron relative to the other two genera highlights its early divergence within the family Branchiostomatidae. The scale bar indicates the number of substitutions per site.
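The branching order described in the text and in Fig. 9 can be written as a nested structure and queried for monophyly. The sketch below encodes only the topology as described, without branch lengths or support values, using nested tuples. The helper functions are hypothetical illustrations and not part of the phylogenetic software used in the study.

```python
def leaves(node):
    """Return the set of taxon names at or below a node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= leaves(child)
    return found


def is_monophyletic(tree, taxa):
    """True if some node of the rooted tree contains exactly the given taxa."""
    taxa = set(taxa)
    if isinstance(tree, str):
        return taxa == {tree}
    if leaves(tree) == taxa:
        return True
    return any(is_monophyletic(child, taxa) for child in tree)


# Topology as described in the text (no branch lengths or support values).
asymmetron = ("A. inferum", ("A. lucayanum", "Asymmetron sp."))
epigonichthys = ("E. cultellus", "E. maldivensis")
branchiostoma = (
    ("B. floridae", "B. lanceolatum"),
    ("B. malayanum", ("B. belcheri", "B. japonicum")),
)
tree = ("Ciona intestinalis", (asymmetron, (epigonichthys, branchiostoma)))

print(is_monophyletic(tree, leaves(branchiostoma)))
print(is_monophyletic(tree, {"B. malayanum", "B. belcheri", "B. japonicum"}))
print(is_monophyletic(tree, {"B. malayanum", "B. floridae"}))
```

In this encoding, each genus and the subclade containing B. malayanum, B. belcheri, and B. japonicum are monophyletic, whereas B. malayanum and B. floridae alone do not form a clade.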

Discussion

Mitochondrial DNA (mtDNA) in metazoans serves as a powerful model for investigating evolutionary genomics due to its relatively simple structure, maternal inheritance, and rapid evolutionary rate. The emergence of high-throughput sequencing technologies, coupled with advances in bioinformatics, has significantly enhanced the accessibility and utility of complete mitochondrial genomes. As a result, mtDNA is now extensively applied in taxonomic identification, phylogenetic reconstruction, and the study of evolutionary relationships across a wide range of freshwater and marine organisms29,30,31,32,33. The present study represents the first comprehensive morphological and mitogenomic analysis of B. malayanum within the context of the family Branchiostomatidae, integrating morphological characterization with complete mitochondrial genome sequencing and phylogenetic reconstruction. Our findings contribute significantly to understanding amphioxus systematics and mitochondrial genome evolution, and provide a clarified phylogenetic position for B. malayanum, which had previously received limited genomic attention despite its biogeographic importance in the Indo-West Pacific region. This discussion interprets our results within the broader framework of cephalochordate evolutionary biology, evaluates the implications for Branchiostomatidae taxonomy, and suggests directions for future research.

Morphological observations confirmed that B. malayanum exhibits key diagnostic features characteristic of the genus Branchiostoma, including a persistent notochord, dorsal nerve cord, V-shaped myomeres, and numerous pharyngeal slits—fundamental traits retained from the last common ancestor of chordates. These features underscore its classification within Cephalochordata and reflect the morphological conservatism long recognized in amphioxus8,9. However, our detailed morphological characterization of B. malayanum also revealed subtle but potentially significant variation in traits such as body size, pigmentation, and myomere patterning. While these characteristics broadly align with prior descriptions, they also suggest possible geographic or ecological influences, particularly in the context of the South China Sea. Intraspecific variation was most pronounced in traits associated with total body length and posterior body morphology. The coefficient of variation (CV) was highest for the length of the body segment posterior to the anus (LPA), indicating considerable variability in posterior development among individuals. In contrast, features such as myotome number and fin chamber count displayed low variability, suggesting that these segmental and structural traits are developmentally constrained and more conserved across individuals. Principal component analysis (PCA) and allometric scaling further revealed that while many external morphological traits scale proportionally with body size, others, such as myotome number, exhibit allometric patterns, possibly due to intrinsic developmental limitations. Notably, some traits in B. malayanum, such as notochord length-to-body-length ratio, ventral fin shape, and pigment stripe pattern, differed slightly from those of closely related species like B. belcheri and B. japonicum. 
Although often overlooked due to presumed intraspecific plasticity, these features may still contain phylogenetically informative signals, especially when interpreted alongside molecular data23,28. For example, minor differences in ventral fin positioning or pigmentation may reflect habitat-specific adaptations, such as sediment type or burrowing behavior in the Pearl River Estuary—an interpretation consistent with earlier ecological studies24. Taken together, our findings highlight both the utility and the limitations of morphology in amphioxus taxonomy. While traditional morphological markers remain essential for species identification, their resolution may be compromised by phenotypic plasticity and allometric effects. Therefore, an integrative taxonomic framework that combines morphological and molecular data is essential for resolving species boundaries and understanding evolutionary divergence within Branchiostoma.

The complete mitochondrial genome of B. malayanum exhibits structural features consistent with other members of Branchiostomatidae. The genome, spanning 15,154 bp, contains the standard 13 protein-coding genes (PCGs), 2 ribosomal RNAs (rRNAs), and 22 transfer RNAs (tRNAs), maintaining the gene order and orientation commonly observed across amphioxus mitogenomes. This conservation is consistent with strong purifying selection acting on mitochondrial genome organization in cephalochordates, likely due to the essential role of mitochondrial-encoded proteins in oxidative phosphorylation and energy metabolism33,34. The pronounced AT bias (63.47%) observed in the B. malayanum mitochondrial genome aligns with compositional trends documented in other chordates and is hypothesized to reflect thermodynamic constraints during replication and transcription. Notably, the strand-specific asymmetry (negative AT skew, positive GC skew) seen in B. malayanum matches the direction of bias reported in other amphioxus species (e.g., B. floridae, B. japonicum), suggesting evolutionary conservation of mitochondrial replication mechanisms35,36,37. An important feature revealed in our analysis is the codon usage pattern in the 13 PCGs. The predominance of codons encoding Leu, Val, Gly, and Ala, particularly TTA (Leu) and GTT (Val), suggests translational optimization aligned with tRNA availability. Meanwhile, the underrepresentation of Cys-encoding codons points to functional constraints, possibly related to protein folding dynamics in mitochondrial enzymes. Codon usage bias, as quantified through RSCU, may also be influenced by mutational pressure and selection for efficiency and accuracy in protein synthesis38,39. Furthermore, the consistent use of ATG as an initiation codon, with rare exceptions (GTG for COX1 and ND1), and of complete stop codons (TAA, TAG) across all PCGs reinforces the functional integrity of B. malayanum mitochondrial translation mechanisms.
The presence of canonical cloverleaf structures in most tRNAs, except for a non-standard DHU arm in tRNA-Ser, suggests a balance between structural conservation and flexibility. These secondary structures are vital for efficient aminoacylation and codon recognition, underlining their evolutionary importance.

A core component of this study was the comparative analysis of mitochondrial genetic divergence across representative amphioxus genera. Using COX1, Cytb, 12S rRNA, and 16S rRNA markers, we quantified pairwise genetic distances to elucidate inter- and intra-generic variation. Intrageneric divergence within Branchiostoma was low relative to intergeneric values (e.g., COX1: 0.193 ± 0.007; Cytb: 0.245 ± 0.010). In contrast, intergeneric comparisons showed substantially greater divergence. For instance, COX1 divergence between Asymmetron and Epigonichthys reached 0.271 ± 0.010, while divergence between Branchiostoma and Epigonichthys was 0.230 ± 0.007. The Cytb gene displayed higher divergence values than COX1 (e.g., 0.359 ± 0.014 between Asymmetron and Branchiostoma), indicating that evolutionary rates differ among mitochondrial loci17. Ribosomal RNA markers (12S and 16S rRNA) revealed similar patterns of divergence, with maximum intergeneric values (0.386 ± 0.021 and 0.365 ± 0.016, respectively) comparable to or exceeding those of Cytb, which further supports deep evolutionary separation among amphioxus genera. These values are well above typical thresholds for species-level delimitation, suggesting long-standing reproductive isolation and independent evolutionary trajectories19. The pairwise genetic distances, calculated using the Kimura two-parameter model, indicate that B. malayanum is a distinct evolutionary lineage, confirming its species-level status within the genus. These distances fall within the range of recognized interspecific divergence among amphioxus18,21, supporting earlier reports that suggested the presence of cryptic species or geographically structured lineages within Branchiostomatidae. It is particularly noteworthy that the divergence between B. malayanum and its closest relatives (e.g., B. belcheri, B. lanceolatum) was comparable to or exceeded that observed among other well-established species pairs. This result reinforces the importance of mitochondrial data in amphioxus taxonomy, particularly in contexts where morphological markers are insufficiently diagnostic. Moreover, our comparative mitogenomic framework allows for the detection of lineage-specific molecular signatures that may underpin ecological or behavioral adaptations, such as differential salinity tolerance or burrowing depth28.

Phylogenetic analyses based on complete mitochondrial genome sequences, conducted using both Maximum Likelihood (ML) and Bayesian Inference (BI) methods, yielded congruent and well-supported topologies that robustly resolve the evolutionary relationships within the family Branchiostomatidae. All analyses consistently recovered three strongly supported monophyletic clades corresponding to the genera Branchiostoma, Epigonichthys, and Asymmetron, with Ciona intestinalis designated as the outgroup. This phylogenetic structure aligns closely with prior mitogenomic and multi-locus studies and reaffirms the tripartite division of extant cephalochordates17,40,41. Within the Branchiostoma clade, B. malayanum was consistently placed in a well-supported lineage alongside B. belcheri and B. japonicum, forming a distinct subgroup within the genus. High bootstrap (98–100%) and posterior probability values (> 0.99) support the close phylogenetic affinity among these species, suggesting a relatively recent common ancestry28. Despite this proximity, genetic distance analyses revealed sufficient divergence to warrant their recognition as separate species, with evidence for reproductive isolation and independent evolutionary trajectories. The clustering of these species likely reflects a shared biogeographic history in the Indo-West Pacific, where they exhibit overlapping but ecologically differentiated distributions. In contrast, B. floridae and B. lanceolatum, found in the Atlantic and Mediterranean regions, formed a separate sister group, suggesting an earlier divergence potentially shaped by historical geographic isolation or ecological specialization20. The phylogenetic position of B. malayanum also provides insight into the internal diversification of Branchiostoma. While previous hypotheses proposed that B. malayanum might represent an early-diverging or transitional lineage, the current mitogenome-wide analysis places it squarely within the core of Branchiostoma, refuting notions of ambiguous taxonomic status based on earlier morphological data. Its deep divergence from B. belcheri and B. japonicum, despite morphological similarities, underscores its evolutionary distinctiveness and highlights the importance of molecular data in resolving cryptic speciation events in amphioxus. Beyond Branchiostoma, the genera Asymmetron and Epigonichthys were recovered as monophyletic sister clades, forming an early-diverging lineage relative to Branchiostoma. This topology is consistent with prior phylogenomic findings and supports the hypothesis that Asymmetron represents the earliest diverging amphioxus lineage. The early-diverging position of Asymmetron is further corroborated by its retention of plesiomorphic traits, such as delayed metamorphosis, reduced notochord stiffness, and distinct mitochondrial gene arrangements18. Epigonichthys (comprising E. cultellus and E. maldivensis) formed a well-supported clade and appeared as a sister taxon to Branchiostoma, reinforcing the evolutionary coherence of these three genera. While most internal nodes were supported by high bootstrap and posterior probability values, a few within the Asymmetron–Epigonichthys grouping showed moderate support (e.g., bootstrap ~ 89%, posterior probability ~ 0.85), potentially reflecting incomplete lineage sorting or mutational saturation in mitochondrial markers. The evolutionary history inferred from these phylogenetic patterns suggests that cephalochordates underwent an early divergence into three major lineages, followed by radiation within each genus. The early-diverging positions of Asymmetron and Epigonichthys indicate ancient lineages that have retained key ancestral features, while Branchiostoma appears to represent a more recently diversified clade.
Within Branchiostoma, the phylogenetic structure implies that the Indo-West Pacific taxa (B. belcheri, B. japonicum, and B. malayanum) arose from more recent divergence events, potentially driven by historical biogeographic barriers, ecological gradients, or localized adaptation. Importantly, the evolutionary distinctiveness of B. malayanum is further supported by its relatively low intraspecific variability in mitochondrial sequences, suggesting a phase of evolutionary stasis or the action of stabilizing selection. At the same time, its clear separation from congeners points to a long-standing, independently evolving lineage. These findings not only refine our understanding of B. malayanum’s taxonomic placement but also contribute to a broader understanding of cephalochordate evolution, particularly the dynamics of divergence, speciation, and lineage sorting in early chordates20. Our results demonstrate that mitochondrial genomes provide robust phylogenetic resolution among amphioxus taxa and support the continued use of mitogenomic data in resolving deep evolutionary relationships in early-diverging chordates. The integrative framework combining molecular, morphological, and ecological data holds significant promise for clarifying the evolutionary history of amphioxus and illuminating the origins of vertebrate traits.

The discovery and complete mitochondrial genome sequencing of B. malayanum from the Pearl River Estuary significantly extend the known geographic range of this species, which was previously reported only from Southeast Asia. Its confirmed presence in the South China Sea suggests a broader ecological amplitude than previously recognized and raises the possibility that historical dispersal or vicariance events, likely driven by sea-level fluctuations and plate tectonic activity in the Indo-Pacific, contributed to its current distribution. The unique environmental conditions of the Pearl River Estuary—characterized by fluctuating salinity, high sedimentation, and anthropogenic disturbance—present potential selective pressures that may have shaped local adaptation in B. malayanum. The species’ persistence in this estuarine habitat implies underlying physiological or behavioral adaptations that remain to be elucidated. Future research should also aim to expand mitogenomic sampling across Branchiostomatidae, especially in understudied regions such as the eastern Indian Ocean, the Red Sea, and the western Pacific islands. Population-level mitogenomic studies of B. malayanum across multiple estuarine systems in Southeast Asia would offer valuable insights into gene flow, historical demography, and potential local adaptation, and could reveal cryptic population structure or incipient speciation. Such research efforts are not only important for evolutionary biology but are also critical for conservation. Amphioxus habitats, particularly in estuarine and coastal zones, are increasingly threatened by habitat degradation, pollution, and climate change. Understanding the evolutionary distinctiveness and adaptive capacity of species like B. malayanum is essential for developing informed conservation strategies and preserving the phylogenetic diversity of early-diverging chordates.

Conclusion

This study provides the first complete mitochondrial genome and a detailed comparative analysis of B. malayanum, offering new insights into its taxonomy, mitogenomic structure, and evolutionary relationships within Branchiostomatidae. The mitogenome of B. malayanum displays conserved features typical of cephalochordates, including gene order, strand asymmetry, and codon usage patterns, reflecting strong evolutionary constraints on mitochondrial function. Morphological examination revealed a combination of conserved and variable traits, with developmental allometry affecting external body proportions but not segmental counts. These results emphasize the limitations of morphology alone in amphioxus taxonomy and support the use of integrative approaches. Comparative genetic analyses based on COX1, Cytb, 12S rRNA, and 16S rRNA confirmed high intergeneric divergence among Branchiostoma, Asymmetron, and Epigonichthys, and supported the species-level distinctiveness of B. malayanum within its genus. Phylogenetic reconstructions consistently positioned B. malayanum alongside B. belcheri and B. japonicum, forming a well-supported clade distinct from other Branchiostoma lineages. The evolutionary history inferred from mitochondrial data suggests that B. malayanum has undergone independent divergence, likely shaped by its regional distribution in the South China Sea. These findings underscore the importance of mitogenomic data in resolving cryptic diversity and reconstructing lineage relationships in early-diverging chordates. In sum, this study advances our understanding of amphioxus evolution and provides a valuable genomic resource for future phylogenetic and comparative genomic studies in cephalochordates.

Materials and methods

Sample collection

Specimens of B. malayanum were collected from the Pearl River Estuary in the South China Sea (22° 32′ N, 114° 28′ E) during a field survey conducted from May 5 to 7, 2024. Bottom sediment was collected from a 25 cm × 25 cm quadrat underwater by divers using a self-fabricated 316 stainless steel sampling tool and transferred into a nylon mesh bag (mesh size: 0.5 mm). Onshore, the sediment was washed through a 0.18 mm mesh sieve, and Branchiostomatidae specimens were hand-picked and immediately preserved in 95% ethanol. During transport, all samples were temporarily stored in an ice-filled cold box. Upon arrival at the Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, Branchiostomatidae samples were refrigerated at 4 °C, while sediment samples were frozen at − 20 °C for subsequent analyses.

Morphological examinations

The morphological traits of B. malayanum were categorized into two groups: counting traits and measuring traits. Measurement of counting traits: Individuals of B. malayanum were placed under a dissecting microscope equipped with a light source. The following traits were visually counted: the number of myotomes anterior to the ventral pore (NMAVP), between the ventral pore and the anus (NMVPA), and posterior to the anus (NMPA), as well as the total number of myotomes (TNM) and the number of fin chambers in the dorsal and preanal regions (NFCD, NFCP). Measurement of measuring traits: B. malayanum individuals were positioned flat in a petri dish. Their total length (TL), body height (BH), caudal fin height (CFH), rostral fin length (RFL), and rostral fin height (RFH) were measured using a vernier caliper. Additionally, under a dissecting microscope with a light source and an ocular micrometer, the lengths of the segments anterior to the ventral pore (LSAVP), from the ventral pore to the anus (LVPA), and posterior to the anus (LPA) were measured, along with the lengths of the upper and lower lobes of the caudal fin (LUCF and LLCF).

The morphological traits of B. malayanum, including total length (TL), body height (BH), rostral fin length (RFL), rostral fin height (RFH), the lengths of the segments anterior to the ventral pore (LSAVP), from the ventral pore to the anus (LVPA), and posterior to the anus (LPA), the lengths of the upper and lower lobes of the caudal fin (LUCF and LLCF), caudal fin height (CFH), the number of myotomes anterior to the ventral pore (NMAVP), between the ventral pore and the anus (NMVPA), and posterior to the anus (NMPA), as well as the total number of myotomes (TNM) and the number of fin chambers in the dorsal and preanal regions (NFCD, NFCP), were log-transformed to approximate a normal distribution. The mean values and standard deviations of these morphological traits were calculated. The coefficient of variation (CV) was used to quantify the spatial variation of the mean values of each trait, with the formula: CV = (Standard Deviation / Mean) × 100%. Principal Component Analysis (PCA) was performed on the mean values of the morphological traits of B. malayanum. Allometric relationships were used to describe the relative growth rates of different traits in B. malayanum. The allometric equation \(Y = aX^{b}\) was employed to fit the allometric relationships between total length (TL) and other morphological traits. By taking the logarithm of both sides of the equation, we obtained \(\ln(Y) = \ln(a) + b\ln(X)\), where ln(a) is the allometric constant and b is the allometric coefficient. When b = 1.0, the dependent variable (Y) and the independent variable (X) exhibit isometric growth. When b > 1.0 or b < 1.0, an allometric relationship is indicated.
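The CV formula and the log-linear allometric fit described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example measurements are hypothetical, the sample (n − 1) standard deviation is assumed because the text does not say which estimator was used, and the regression is an ordinary least-squares fit of ln(Y) on ln(X).

```python
import math

def coefficient_of_variation(values):
    """CV = (standard deviation / mean) * 100, in percent.
    Assumption: the sample (n - 1) standard deviation is used."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance) / mean * 100.0

def fit_allometry(x, y):
    """Ordinary least-squares fit of ln(Y) = ln(a) + b ln(X).
    Returns (a, b): a is recovered from the intercept, b is the slope."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_lx = sum(lx) / n
    mean_ly = sum(ly) / n
    sxx = sum((u - mean_lx) ** 2 for u in lx)
    sxy = sum((u - mean_lx) * (w - mean_ly) for u, w in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(mean_ly - b * mean_lx)
    return a, b

# Hypothetical example: total length (TL) against a trait that follows
# Y = 0.1 * TL^1.2 exactly, so the fit should recover a = 0.1 and b = 1.2.
tl = [20.0, 25.0, 30.0, 35.0, 40.0]
trait = [0.1 * t ** 1.2 for t in tl]
a_hat, b_hat = fit_allometry(tl, trait)
cv_example = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
```

A fitted b close to 1.0 would indicate isometric growth of the trait relative to TL; values above or below 1.0 indicate positive or negative allometry.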

DNA extraction, sequencing and annotation

Genomic DNA was extracted from the muscle tissue of a B. malayanum specimen using the TIANamp Marine Animals DNA Kit (TIANGEN, China). The specimen and the extracted DNA are deposited at the Guangdong Provincial Key Laboratory of Fishery Ecology and Environment (https://www.southchinafish.ac.cn/english.htm, Lei Xu, xulei@scsfri.ac.cn) under the voucher number SCS2024-S07-140. The sequencing library was prepared using the Illumina TruSeq™ Nano DNA Sample Preparation Kit (Illumina, San Diego, USA) following the manufacturer’s protocol. The library was loaded onto the Illumina NovaSeq 6000 platform for paired-end (PE) 2 × 150 bp sequencing at Jierui Biotech (Guangzhou, China). Raw reads were checked using FastQC v.0.12.0 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). Quality trimming and filtering were conducted using fastp version 0.23.4 42. Reads with more than 5% unknown nucleotides, low-quality reads (more than 50% of bases with a Q-value ≤ 20), and all unpaired reads were removed. The filtered data were assembled using MitoFinder version 1.4.1 43 with default settings, utilizing the uploaded COX1 sequence (GenBank: AB248231) as a reference. The mitochondrial genome was annotated using the MITOS WebServer and subsequently verified through BLAST analysis44. Transfer RNA (tRNA) genes were identified using tRNAscan-SE version 1.21 on its web-based platform, with default parameters and the genetic code set to “Mito/chloroplast” to reflect the mitochondrial origin of the sequences45. The mitochondrial genome of the species was visualized based on assembly and annotation data, with a graphical map generated using the CGView Server to clearly represent the genomic features46. Nucleotide composition skewness was assessed using the formulas: \(\text{AT-skew} = \frac{A - T}{A + T}\) and \(\text{GC-skew} = \frac{G - C}{G + C}\) 36, where A, T, G, and C denote the counts of the corresponding nucleotides.
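As a small worked illustration of the skew formulas above, the following Python sketch counts bases in a sequence string and returns both skew values. The function name and the toy sequence are hypothetical; ambiguous bases such as N are simply ignored, and a sequence lacking A/T or G/C would need extra handling that is omitted here.

```python
def base_skews(seq):
    """Return (AT-skew, GC-skew) for a nucleotide sequence.
    AT-skew = (A - T) / (A + T); GC-skew = (G - C) / (G + C).
    Characters other than A, T, G, C are not counted."""
    s = seq.upper()
    a, t, g, c = (s.count(base) for base in "ATGC")
    at_skew = (a - t) / (a + t)
    gc_skew = (g - c) / (g + c)
    return at_skew, gc_skew

# Toy sequence (not real data): 3 A, 1 T, 4 G, 2 C.
at_skew, gc_skew = base_skews("AAATGGGGCC")
```

A positive AT-skew means A outnumbers T on the analysed strand, and a positive GC-skew means G outnumbers C; negative values indicate the reverse.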
Relative synonymous codon usage (RSCU) for the 13 protein-coding genes (PCGs) was calculated using MEGA version 6.0 47. This analysis offers valuable insights into codon usage bias, which is critical for understanding the genetic coding strategies and evolutionary dynamics of the mitochondrial genome. To quantify genetic divergence among the three extant amphioxus genera (Asymmetron, Branchiostoma, and Epigonichthys), we calculated pairwise evolutionary distances using the Kimura two-parameter (K2P) model. Median percentage values were estimated for both intra- and inter-generic comparisons. The analysis was based on two mitochondrial protein-coding genes, COX1 (cytochrome c oxidase subunit I) and Cytb (cytochrome b), along with two mitochondrial ribosomal RNA genes, 12S rRNA and 16S rRNA. Molecular evolutionary rates were calibrated under a strict molecular clock assumption. Pairwise distance analyses, rather than tree-based phylogenetic reconstruction, were performed to evaluate genetic differentiation using the K2P model in MEGA version 6.0 47. Standard errors of the K2P distances were estimated from 10,000 nonparametric bootstrap replicates to ensure the statistical robustness of the divergence estimates.
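For readers unfamiliar with the K2P model, the distance between two aligned sequences can be sketched as below. It uses the standard formula d = −½ ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences. This is not MEGA's implementation: the function name and the toy sequences are hypothetical, and sites containing gaps or ambiguous bases are skipped pairwise, which is an assumption rather than a setting reported in the text.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.
    Assumption: sites with gaps or ambiguous bases are ignored (pairwise deletion)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Toy alignment (not real data): one transition (A/G) and one transversion (C/A)
# across 10 sites, so P = Q = 0.1.
d_example = k2p_distance("ACGTACGTAC", "GCGTACGTAA")
d_identical = k2p_distance("ACGTACGTAC", "ACGTACGTAC")
```

Because the correction grows with the observed proportions, the K2P distance is always at least as large as the raw proportion of differing sites (0.2 in the toy example).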

Phylogenetic analysis

Protein-coding genes were aligned using MAFFT version 7 implemented in PhyloSuite under default parameters with codon alignment mode48. A total of ten mitochondrial genomes from species within the family Branchiostomatidae were retrieved from the NCBI database and analyzed to infer phylogenetic relationships, with Ciona intestinalis (accession number AM292218) designated as the outgroup (Table S1). Phylogenetic analyses were performed using PhyloSuite version 1.2.1 49, incorporating both Maximum Likelihood (ML) and Bayesian Inference (BI) approaches. ML phylogenies were reconstructed with IQ-TREE50 under an edge-linked partition model, utilizing 5000 ultrafast bootstrap replicates and the Shimodaira–Hasegawa-like approximate likelihood ratio test (SH-aLRT) to evaluate branch support51. BI analyses were conducted with MrBayes version 3.2.6 52 under a partitioned model framework, consisting of two parallel runs for 10,000 generations. The first 25% of sampled trees were discarded as burn-in to ensure convergence and reliability of the posterior probability estimates.