Introduction

The shorthead drum, Larimus breviceps Cuvier, 1830, is a member of the family Sciaenidae, widely distributed in the western Atlantic from the Antilles and Costa Rica to southern Brazil1,2. This species inhabits coastal waters and estuaries, at depths of up to 60 m on muddy or sandy bottoms3. Larimus breviceps is a small sciaenid, with total length reaching 32.5 cm4, classified as a marine migrant species: adults live in coastal areas and use estuaries for feeding and spawning5,6.

The shorthead drum is a fishery resource caught by artisanal fisheries, mainly as bycatch in Brazilian shrimp fisheries, and a recent study indicated slight overexploitation and overfishing in Northeast Brazil7. However, despite its ecological and economic importance, understanding of the genetic aspects of the species is limited, except for its inclusion in a DNA barcoding study8 and in a phylogeny of the western South Atlantic Sciaenidae9. Both studies detected two groups with a high degree of genetic divergence, 7.6% for COI in Ribeiro et al.8, and 2.8% for 16S rDNA, 9.2% for COI, and 1% for Tmo-4C4 in Santos et al.9, consistent with species-level differentiation. However, the literature records only L. breviceps in the western South Atlantic1,2, and the biological significance of this genetic divergence remains to be explored. Moreover, no studies have assessed whether these genetic differences are accompanied by morphological variation between groups of Larimus. In the family Sciaenidae, morphometric studies have detected subtle morphological differences in cryptic species of Macrodon, which inhabits the same area as Larimus10. Geometric morphometric analysis has become a widespread approach for comparing cryptic species and defining stock units11, particularly for taxa with some degree of threat12 or those of significant economic or social importance10. Therefore, a systematic study is necessary to evaluate the presence of cryptic species of Larimus in the western South Atlantic.

Phylogeographic studies of taxa from the western South Atlantic suggest that climatic fluctuations or physical barriers have driven allopatric speciation10,13,14,15,16. Also, ecological speciation has been suggested to explain divergence with gene flow in marine and estuarine species in the area14,17,18,19.

The western South Atlantic has been influenced by distinct patterns of marine circulation20,21,22, which have created thermal gradients that may have promoted genetic differentiation in the populations of species that inhabit this area. Another relevant factor that may influence population structure in widely distributed species dependent on specific environments is the habitat discontinuity produced by variation in the geomorphological conditions of the continental shelf. Areas with predominantly muddy bottoms are found primarily in the vicinity of major rivers, where sediments accumulate from the fluvial discharge, mainly on the northern and southern coasts of Brazil, separated by areas of predominantly rocky bottoms with coral reefs on the northeastern coast of Brazil21. Glaciations have also shaped the genetic structure of populations of marine species by changing ocean circulation patterns, substantially reducing the sea surface temperature and sea level (which decreased by up to 140 m), and causing profound changes in the availability of habitat for marine and estuarine species23,24.

Given its wide distribution, L. breviceps inhabits highly diverse areas with marked differences in circulation patterns, salinity, and sea surface temperatures14,21,22. Additionally, distinct geomorphological conditions of the continental shelf, along with the mouths of large rivers (Amazon, São Francisco and Doce rivers), which discharge freshwater and sediments in some areas, further contribute to these differences14,25. Furthermore, its preference for muddy bottoms associated with the discontinuities of this feature in the western South Atlantic may promote divergent selection, causing population differentiation.

Therefore, considering that the species inhabits areas with marked environmental differences, the present study aimed to evaluate the hypothesis of L. breviceps speciation in the western South Atlantic, sampling populations from Pará on the northern coast of Brazil to Santa Catarina at the southern extreme of the species range on the Brazilian coast, using phylogenetic and population genetic analyses based on a multilocus approach with the mitochondrial (Cytochrome c Oxidase subunit I - COI, the Control Region - CR, and Cytochrome b – Cyt b) and nuclear (Tmo-4C4 and Insulin-like Growth Factor 1 – IGF1) markers. A quantitative morphometric analysis of body shape was also included to assess whether the potential genetic differentiation is reflected in morphological differences between groups.

Results

Phylogenetic analysis, TMRCA estimation, and sequence divergence

For the construction of the Bayesian gene trees (BI) and species tree for the time of the most recent common ancestor (TMRCA) estimates we used an aligned dataset of 3062 bps, including indels, from 79 L. breviceps individuals, composed of fragments of CR (824 bps), COI (624 bps), Cyt b (786 bps), IGF1 (363 bps), and Tmo-4C4 (465 bps), with a total of 495 variable sites. The sequences generated in this study were deposited in GenBank under accession numbers PQ094945 - PQ095298 (for CR), PQ060262-PQ060358 (for COI), PQ095299 - PQ095395 (for Cyt b), PQ095396 - PQ095490 (for Tmo-4C4), and PQ095491 - PQ095568 (for IGF1).

All the trees were concordant and revealed two monophyletic lineages, which were designated as lineages I and II, with high posterior probabilities (Fig. 1; Supplementary Figs. 1–5). These groups coexist within the study area, except in Pará, where all specimens were assigned to lineage I, and in Paraíba, where all individuals were assigned to lineage II. As the analyses produced trees with the same arrangement, only the species tree is provided (Fig. 1). It is noteworthy that, in all analyses, except those based on nuclear markers, lineage II is closer to L. pacificus, from the eastern Pacific, than to lineage I (Fig. 1; Supplementary Figs. 1–5).

Fig. 1

Species tree based on Bayesian inference of the nuclear (IGF1 and Tmo-4C4) and mitochondrial data (COI, Cyt b, and CR). The numbers below the nodes indicate the posterior probability, while those above the nodes show the TMRCA and credibility intervals. Node bars depict 95% highest posterior density. The species Sciaenops ocellatus and Nebris microps were used as outgroups.

The TMRCA estimates show that the separation of Larimus breviceps LI from the clade comprising L. pacificus and L. breviceps LII dates to 12.3 (HPD: 11.1–14.5) Ma, in the middle Miocene, whereas L. breviceps LII split from L. pacificus at 3.4 (HPD: 1.3–5.4) Ma, during the Pliocene (Fig. 1).

The divergence within the Larimus lineages was low for all markers, ranging from 0.03 to 0.3%, except for IGF1, which showed 1.6% of divergence within lineage I and 2.9% within lineage II (Table 1). The mean divergence between Larimus lineages I and II ranged from 0.5% (for Tmo-4C4) to 10.8% (for CR) (Table 1).

Table 1 Results of the genetic diversity, neutrality tests, and genetic divergence for the two Larimus breviceps lineages. Haplotype and nucleotide diversity values include standard deviations for each estimate. N, Number of individuals; NH, Number of haplotypes; NC, Not calculated.

Population, phylogeographic and demographic analyses

For these analyses we used fragments of CR (822 bps), COI (624 bps), Cyt b (786 bps), IGF1 (387 bps), and Tmo-4C4 (465 bps). Only four Tmo-4C4 haplotypes were identified; these were sufficient to discriminate between the Larimus lineages, but this marker was included in only part of the analyses. As the phylogenetic results indicated the presence of two distinct lineages of Larimus, found in most of the sampled areas, the analyses were performed for each lineage independently. Although variation in sample size among locations may influence the results of genetic analyses, our findings showed no significant differences between populations within lineages I and II. Therefore, populations within each lineage were pooled for population, demographic, and phylogeographic analyses. The number of individuals, haplotypes, and genetic diversity for each marker and lineage are summarized in Table 1.

The levels of haplotype diversity in lineage I varied from 0.581 (for COI) to 0.892 (for IGF1) (Table 1). For lineage II, the haplotype diversity was lower, and varied from 0.2452 (Cyt b and COI) to 0.661 (IGF1) (Table 1). In contrast, the nucleotide diversity was low in both groups and for all markers, ranging from 0.0003 (for Cyt b in lineage II) to 0.029 (for IGF1 in lineage II) (Table 1). Overall, genetic diversity was higher in lineage I than in lineage II.

For all markers, the haplotype networks indicate two groups that correspond to lineages I and II recovered in the phylogenetic trees, separated by a minimum of two (Tmo-4C4) and a maximum of 89 (CR) mutations (Fig. 2B, C, D, E, and F). Within each group (except for Tmo-4C4), the network is star-shaped, with a few central and many single peripheral haplotypes. Additionally, within lineages I and II, the most common haplotypes were shared by populations separated by wide geographical distances, indicating the absence of geographically structured groups (Fig. 2B, C, D, E, and F).

Fig. 2

Sampling locations and haplotype genealogy of Larimus breviceps. (A) Sampling locations and the number of individuals analyzed. The haplotype network is based on the maximum likelihood trees for the (B) CR, (C) Cyt b, (D) COI, (E) Tmo-4C4, and (F) IGF1 sequences. Each circle represents a haplotype, and the circle’s size is proportional to its frequency. The dots represent inferred, unsampled or extinct haplotypes. The map shows the geographic distribution of the haplotypes present in lineages I and II.

The Structure analysis confirmed the pattern observed in the remaining analyses, indicating K = 2 as the most likely number of clusters (Fig. 3A). Within each lineage, however, a mixed pattern was found, indicating a lack of geographic sub-structuring (Fig. 3B and C).

Fig. 3

Assignment of individuals to the two genetic lineages using the Structure algorithm, considering K = 2 for (A) Lineages I and II, as defined by the analysis, (B) the populations of lineage I, and (C) the populations of lineage II.

The isolation with migration (IM) model indicated an absence of gene flow between lineages I and II (Fig. 4A). When applied to the analysis of gene flow among the populations within each lineage, the IM model pointed to a bidirectional flow between all pairs of populations in lineage I, except for Bahia-Espírito Santo, where it was unidirectional (Fig. 4B). In lineage II (Fig. 4C), gene flow was unidirectional between pairs of populations (Ceará-Paraíba; Bahia-Paraíba; Bahia-São Paulo).

Fig. 4

The posterior probability distributions and population migration rate (2Nm) estimates derived from the Isolation-with-Migration model (IMa2) for pairwise comparisons between (A) Lineages I and II, (B) populations of lineage I, and (C) populations of lineage II. PA = Pará; CE = Ceará; PB = Paraíba; BA = Bahia; ES = Espírito Santo; SP = São Paulo; SC = Santa Catarina; LI = lineage I and LII = lineage II.

In lineage I, Tajima’s D and Fu’s Fs were negative (Table 1), although they were significant only in the case of the CR (both tests), Cyt b and IGF1 (Fs only). In lineage II, all markers (except IGF1) presented significantly negative values for both parameters (Table 1).

For both lineages, the Extended Bayesian Skyline Plot (EBSP) analysis revealed a pattern of population expansion estimated to have occurred around 30 thousand years ago (Fig. 5).

Fig. 5

Extended Bayesian Skyline Plots based on the five loci analyzed in samples from (A) Larimus breviceps lineage I and (B) Larimus breviceps lineage II.

Morphometric analysis

The multidimensional scaling analysis (MDS) of the 20 morphometric variables revealed a partial separation of the samples from lineages I and II (Fig. 6). The ANOVA found significant differences between groups in all the dependent variables, with individuals from lineage I having proportionately larger body sizes than those from lineage II (Supplementary Table S1). The PCA confirmed that the individuals of lineage I had head and body dimensions greater than those of lineage II, with head length (measure 1–17) and standard length (measure 1–11) contributing most to the variability in the data (Fig. 7; Supplementary Table S2).

Fig. 6

Ordination analysis by the MDS method of the morphological measurements of the genetically defined groups of Larimus breviceps (LI and LII).

Fig. 7

Dispersal of the scores of the first and second principal components of the (A) head and (B) body size dimensions data for each genetically-defined lineage of Larimus breviceps.

Discussion

Genetic and morphological differentiation of Larimus lineages

This is the first study to use molecular and morphological analyses to evaluate the speciation hypothesis in L. breviceps, the only valid species of the genus described from the western South Atlantic, which is widely distributed from the Greater Antilles and Costa Rica to Santa Catarina on the Brazilian coast. Although we did not sample the species throughout its entire distribution range, all phylogenetic, sequence divergence and population genetic analyses indicated the existence of two genetically distinct, sympatric lineages of L. breviceps. Population subdivision and speciation in marine ecosystems have been recorded and attributed to historical, physical or ecological barriers10,13,14,17,18,26.

The phylogenetic analyses indicated that L. breviceps is paraphyletic, where lineage II is closer to L. pacificus, which occurs in the eastern Pacific, than to lineage I. It is noteworthy that the ancestors of the lineages diverged in the middle Miocene, approximately 12.3 (HPD: 11.1–14.5) Ma, while the divergence between L. breviceps lineage II and L. pacificus occurred in the Pliocene, 3.4 (HPD: 1.3–5.4) Ma. All population analyses indicate that, although they are sympatric, there are deep and significant differences between lineages I and II, presumably due to the absence of gene flow between them, which corroborates the pattern revealed by the phylogenetic analyses.

Our results are in agreement with the hypothesis of two lineages as suggested by Ribeiro et al.8 and Santos et al.9. Although we did not sample specimens from the northern extreme of the species range, the presence of a lineage closer to the Pacific species (L. pacificus) is similar to the pattern identified by Silva et al.13 in a study that demonstrated speciation in the sciaenid Ophioscion punctatissimus (currently the species Stellifer punctatissimus and Stellifer menezesi, see Fricke et al.2). The time of divergence between Larimus lineages (12.3 Ma; HPD: 11.1–14.5 Ma) coincides with the estimated time for the constriction of the Central American Seaway (CAS), which resulted from the gradual uplift of the Isthmus of Panama that began around 17 Ma and culminated in the closure of the connection between the Atlantic and Pacific around 3 Ma27,28,29. Between 12.9 and 11.8 Ma, the Panama sill rose about 1000 m, due to tectonic disturbances in northwestern South America. This gradual elevation of the Panama sill narrowed the CAS and reduced the flow of water from the Pacific to the Caribbean, which resulted in changes in temperature, salinity, and circulation patterns of marine currents in the Atlantic27,30,31. One of the effects of the reduced inflow of Pacific waters to the Atlantic, in the middle Miocene, was the reversal of the flow of the North Brazil Current (NBC) from southward to northward, its current pattern31, which may have isolated populations of Larimus to the north and south of the area of influence of the NBC in the equatorial Atlantic, promoting divergence between the ancestors of the lineages. However, a more comprehensive analysis of the differentiation pattern of these lineages requires sampling throughout the entire range of Larimus in the western Atlantic.

Cladogenesis between L. pacificus and L. breviceps lineage II occurred 3.4 (HPD: 1.3–5.4) Ma, which coincides with the estimated time for the closure of the Pacific-Atlantic connection due to the uplift of the Isthmus of Panama27. Therefore, the isolation of populations in the Atlantic and Pacific, promoted by the closure of the Isthmus of Panama, must have prevented gene flow and allowed them to follow independent evolutionary trajectories, culminating in speciation between L. pacificus and L. breviceps lineage II. Several studies have documented geminate species that evolved from populations isolated by the Isthmus of Panama13,26,29,30,32.

The morphometric analysis showed that head and body dimensions were larger in lineage I than in lineage II, which reinforces the hypothesis of two species of Larimus in the western South Atlantic. However, the limitations of these analyses must be considered. Multivariate morphometry quantitatively characterizes biological shape by simultaneously analyzing the variance and covariance among the measurements of an organism, and one of the most difficult problems is removing the effect of allometric growth in some specimens, especially given that in this study we used individuals collected in different regions, which reflects the richness of the environments and growth performance33,34,35.

Differences in body shape may be associated with ecological adaptations to distinct ecological niches, related to variations in habitat quality and feeding strategies36, although no such differences can be confirmed in the present case. However, variations in behavioral parameters, such as feeding strategies, reproductive patterns, competition, and even the position of the animals in the water column, may contribute to this differentiation. Ecomorphology in fish, as in other organisms, reflects pressures to which species are exposed, whether from relationships with other species or adaptability demands37,38. In Sciaenidae, geometric morphometrics has helped distinguish cryptic species that overlap in most morphological traits10,35, with contrasting patterns associated with distinct habitat use. This reinforces the need for further, detailed studies of the biology and ecology of each group to evaluate the processes that drove morphological differentiation.

Considering the deep genetic divergence and subtle morphological differences, we propose the presence of two cryptic species of Larimus in the western South Atlantic. Thus, a comprehensive revision of the alpha taxonomy of the Larimus species is warranted, including a systematic examination of type specimens or representative material to formally describe the new species within this genus in the western South Atlantic.

Population genetics, phylogeography, and demographic history of Larimus lineages

In general, haplotype diversity was higher in lineage I, which could be related to the larger sample size for this group. In contrast, both lineages showed low nucleotide diversity for all markers, which may reflect aspects of the evolutionary history of the groups.

The pattern of high haplotype and low nucleotide diversity observed in lineage I can be indicative of rapid population expansion and accumulation of mutations after a period of low effective population size caused by bottlenecks39. On the other hand, the low genetic diversity in lineage II may be related to a historical reduction of population size by bottleneck or founder effects39. Both groups are sympatric in regions that went through severe environmental variations due to climatic oscillations during the Pleistocene. During glacial periods, the sea level in the western South Atlantic decreased by between 100 and 140 m, the sea surface temperature may have been up to 6 °C lower than at present, and there were changes in the pattern of circulation of marine currents23,24. Therefore, paleoenvironmental changes may have restricted the availability of habitats for coastal and estuarine species, such as the Larimus lineages, reducing the effective population size and, consequently, their genetic diversity, as proposed for other marine taxa40,41,42. On the other hand, when climatic conditions became favorable, during interglacial periods, there was demographic growth (discussed below) with a consequent increase in diversity levels. However, considering the different patterns of genetic diversity between the two lineages of Larimus, it is likely that lineage II was more affected by the climatic oscillations, although further study is needed to test this assumption.

The population analysis of each Larimus lineage indicates that there is no geographic sub-structuring. The star-shaped haplotype networks of lineages I and II suggest a lack of differentiation within groups, and the Structure analysis did not detect any significant genetic sub-structuring within each group. Furthermore, the migration pattern indicated by IMa2, characterized by high levels of gene flow between pairs of populations within each group, suggests that neither circulation patterns, sea surface temperatures nor habitat discontinuities influence the observed genetic structure of the groups. This evidence indicates that each group represents a large panmictic population in the western South Atlantic. High genetic connectivity in marine species is generally attributed to the dispersal capability of adults, eggs and larvae, which can be strengthened by marine currents43,44,45. Larimus breviceps is classified as a migrant marine fish46, although there is no information on the distances this species may cover during dispersal. It is therefore likely that individuals move to adjacent areas, which can favor the homogenization of populations within each lineage. Yet a deeper understanding of the genetic, biological and ecological aspects is critical for comprehending the patterns of structuring and connectivity of the groups.

With regard to the demographic history of the two groups, the negative and significant neutrality tests, for most markers, indicate a potential pattern of population expansion in both lineages (Table 1). When the D and Fs values are significantly different from zero, they indicate a deviation from neutrality, and tend to be negative when there is an excess of recent mutations, suggesting deviations caused by population expansion, or other evolutionary processes, such as hitchhiking, background selection, or recombination47. Both demographic fluctuations and selection leave similar signatures in populations; however, contraction or expansion of populations can leave signatures throughout the genome, while selection generally affects genomic regions of functional importance48. Therefore, the analysis of multiple loci allows for distinguishing between stochastic demographic processes and selection49.

The EBSP also suggests population expansion in both lineages of Larimus, estimated to have occurred about 30 thousand years ago during the Pleistocene, coinciding with an interglacial period in the southwest Atlantic23,24. During interglacial episodes, marine temperatures increased and the sea level rose over the continental shelf, which may have allowed the establishment of new habitats that favored the expansion of Larimus lineages in the western South Atlantic. Demographic expansion in other teleosts occurring in the western South Atlantic has been reported and is generally associated with Pleistocene climate changes10,40,42,43,44,45.

The sum of the evidence from the present study indicates a process of allopatric speciation, resulting in the formation of two monophyletic lineages, which can be considered distinct species, based on the phylogenetic species concept50. These two species subsequently expanded their ranges and now coexist along the western South Atlantic. Despite the clear differentiation found at the molecular level, the two species present subtle morphometric differences and might thus be considered cryptic species. As the two groups may occupy similar niches, their morphological similarities are likely due to stabilizing selection, which may reinforce the retention of morphological characteristics that enhance fitness in their respective habitats, irrespective of genetic differentiation, a common pattern observed in cryptic Sciaenidae species, such as Macrodon and Stellifer in the western South Atlantic. However, this hypothesis requires further systematic testing. These findings also reinforce the need for a more comprehensive sample of populations and genetic analyses from the area of occurrence of Larimus in the western Atlantic, as well as behavioural studies, ecological niche modelling, and studies of the biology of the two taxa, to define the geographic limits of the two species and assess potential hidden diversity in Larimus in the northern extreme of the species range. Finally, our results indicate that Larimus lineages should be managed as distinct species, implementing appropriate management strategies to guarantee their genetic integrity, particularly in areas like the Brazilian coast where they are impacted by overexploitation.

Materials and methods

Sampling

A total of 353 specimens were sampled, by artisanal fishing, along the western South Atlantic in the Brazilian states of Pará (N = 35), Ceará (N = 36), Paraíba (N = 53), Bahia (N = 60), Espírito Santo (N = 73), São Paulo (N = 43), and Santa Catarina (N = 53) (Fig. 2). The tissue of Larimus pacificus (N = 1) was donated by the Fish Collection of the Zoology Museum of the University of Costa Rica (UCR). The specimens were identified morphologically by consulting the specialized literature3, and samples of muscle tissue were preserved in absolute ethanol and/or frozen until processing in the laboratory.

In Brazil, the samples were purchased from artisanal fishermen, and permission to undertake collection, handling, transportation, and DNA extraction was obtained by Dr. Simoni Santos from the Brazilian Environment Ministry (Permit number 18401-3). The sample from Costa Rica was collected under permissions of the Sistema Nacional de Áreas de Conservación (Permit number R-SINAC-SE-DT-PI-003-2021) and the Comisión Nacional para la Gestión de la Biodiversidad (Permit number R056-2015-OT-CONAGEBIO), accessed through Resolución No. 377 of the Vicerectoría de Investigación of the UCR.

DNA extraction, amplification, and sequencing of genomic regions

The total DNA was extracted using a DNeasy kit (QIAGEN) following the manufacturer’s protocol. To check the quality of the material obtained, the DNA was electrophoresed in agarose gel (1%) stained with Gel Red (Biotium Inc., CA), with the runs being viewed under ultraviolet light.

The fragments of the Control Region (CR), Cytochrome C Oxidase subunit I (COI), Cytochrome b (Cyt b), Tmo-4C4, and intron 2 of the Insulin-like Growth Factor 1 (IGF1) were amplified by Polymerase Chain Reaction (PCR) using the specific primers and amplification protocols described in Table 2. The PCR reactions were run in a total volume of 25 µL containing 4 µL dNTPs (1.25 mM), 2.5 µL buffer (10X), 2 µL MgCl2 (25 mM), 0.25 µL of each primer (200 ng/µL), 1–1.5 µL genomic DNA (100 ng/µL), 1 U of Taq DNA Polymerase (5 U/µL), and purified water. The amplified products were purified with the ExoSAP-IT enzyme (Amersham Pharmacia Biotech, Inc., UK) according to the manufacturer’s instructions and sequenced using the Sanger method with a BigDye Terminator v3.1 Cycle Sequencing kit (Applied Biosystems), following the manufacturer’s instructions. Electrophoresis was conducted in an ABI 3500 XL (Applied Biosystems).
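The reaction composition above fixes every volume except the purified water, which completes the 25 µL. The following is a minimal sketch of that arithmetic, not part of the original protocol. It assumes that 1 U of Taq taken from a 5 U/µL stock corresponds to 0.2 µL; the function name is hypothetical.

```python
def water_volume_ul(dna_ul, total_ul=25.0):
    """Volume of purified water (µL) needed to complete one PCR reaction."""
    taq_ul = 1.0 / 5.0  # assumption: 1 U drawn from a 5 U/µL stock = 0.2 µL
    fixed_ul = (
        4.0         # dNTPs (1.25 mM)
        + 2.5       # buffer (10X)
        + 2.0       # MgCl2 (25 mM)
        + 2 * 0.25  # forward and reverse primers
        + taq_ul    # Taq DNA polymerase
    )
    return round(total_ul - fixed_ul - dna_ul, 2)


if __name__ == "__main__":
    for dna in (1.0, 1.5):
        print(f"DNA {dna} µL -> water {water_volume_ul(dna)} µL")
```

Under these assumptions, a reaction with 1 µL of genomic DNA needs 14.8 µL of water, and one with 1.5 µL of DNA needs 14.3 µL.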

Table 2 Primers and amplification conditions of the markers used for the analysis of the Larimus breviceps from the western South Atlantic.

Phylogenetic analysis, divergence time estimates, and sequence divergence

The phylogenetic analyses were conducted using a dataset comprising all nuclear and mitochondrial markers derived from 79 individuals: 50 from Larimus lineage I and 29 from Larimus lineage II. Specimens were selected based on haplotypes observed in the Control Region, for which sequences of all markers were available, except IGF1, which was not sequenced in 12 specimens (2 from lineage I and 10 from lineage II). The sequences were aligned in CLUSTAL W with default parameters59 and edited in BioEdit 7.2.560. Degenerate bases were used at the heterozygous sites of the nuclear datasets. The following sciaenid species were used as outgroups, as they showed high sequence similarity with the Larimus lineages in BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi) and have sequences available for most markers used in the phylogenetic analyses: Nebris microps Cuvier, 1830 (COI - NC029875, Cyt b - KP722653, Tmo-4C4 - JX904046, CR - NC034350), a taxon closely related to Larimus61, and Sciaenops ocellatus Linnaeus, 1766 (COI - MF004315, Cyt b - KP722687, CR - NC016867, Tmo-4C4 - KC130636, and IGF1 - GU799603).

To infer the evolutionary relationships of the L. breviceps lineages, we performed Bayesian inference (BI) for each marker in MrBayes 3.2.762. Optimal nucleotide substitution models were selected for each partition in PartitionFinder v2.1.163 based on the Bayesian Information Criterion (BIC). The HKY + Gamma model was selected for the markers Cyt b and CR, the HKY + I model for COI, the F81 model for IGF1, and the K81UF model for Tmo-4C4. The BI analyses were conducted with four simultaneous Markov chain Monte Carlo (MCMC) chains, run for 1,000,000 generations with parameters defined by each model as starting values and sampled every 1,000 generations, with the first 25% of samples discarded as burn-in. Convergence was assessed through effective sample size (ESS) values in Tracer 1.7.164 and by checking the Potential Scale Reduction Factor (PSRF) reported by MrBayes. Upon completion, consensus trees were generated, with posterior probabilities providing node support.

We estimated the divergence times of the Larimus lineages with a Bayesian analysis using StarBEAST365 in BEAST v. 2.7.766. The analysis used the HKY + I + G model for all partitions and a relaxed (log-normal) clock, assuming independent rates on different branches67. The tree prior followed the Yule speciation process, while all other priors were left at the default values available in BEAST 2.7.7. The time calibration was based on the sciaenid fossils Larimus henrici and Larimus steurbauti from the Early Miocene Cantaure Formation (i.e., 23–16 Ma)68, used to constrain the TMRCA of the Larimus clade. Two independent runs were conducted with a chain length of 100 million, sampled every 100 iterations, with a 10% burn-in. The results were checked using Tracer 1.7.164 and then transferred to TreeAnnotator 1.869 to annotate the branches with the estimated divergence times.

All the trees were visualized in FigTree 1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/) and edited in Inkscape 1.1 (https://inkscape.org/pt-br/).

We estimated the mean uncorrected p-distance between Larimus lineages I and II for each dataset in MEGA 1170.
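For readers unfamiliar with the measure, the uncorrected p-distance is the proportion of compared sites at which two aligned sequences differ, and a between-group mean averages it over all pairs formed by one sequence from each group. The sketch below illustrates the calculation only and is not the MEGA implementation. The function names are hypothetical, and skipping sites with gaps or ambiguous bases (pairwise deletion) is an assumption of this example.

```python
from itertools import product


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (pairwise deletion; an assumption of this sketch).
    """
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def mean_between_group_distance(group_1, group_2):
    """Average p-distance over all pairs with one sequence from each group."""
    pairs = list(product(group_1, group_2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy alignments, not data from the study.
    lineage_i = ["AAAA"]
    lineage_ii = ["AAAT", "AATT"]
    print(mean_between_group_distance(lineage_i, lineage_ii))  # 0.375
```

Multiplying the result by 100 gives the percentage divergence in the form reported in Table 1.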

Population, phylogeographic and demographic history analyses

These analyses were based on 353 individuals for the CR, 96 for COI and Cyt b, 94 for Tmo-4C4, and 45 for IGF1. For the nuclear markers, the gametic phase of the heterozygous sequences was determined using the PHASE algorithm71 with the default parameters, and haplotypes with probabilities of less than 0.8 were excluded from the analyses. The sequences were edited and aligned in BioEdit, and the haplotypes were defined using the DnaSP 5 program72.

The haplotype and nucleotide diversity of each marker (except Tmo-4C4) were obtained using ARLEQUIN 3.573.
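For reference, haplotype diversity can be computed as h = n / (n - 1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i in a sample of n sequences, and nucleotide diversity as the mean proportion of differing sites over all pairs of sequences. The sketch below illustrates both statistics under the assumption of gap-free, equal-length alignments; it is not the ARLEQUIN code, and the function names and toy sample are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Haplotype (gene) diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are needed")
    counts = Counter(sequences)
    sum_squared_freq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_squared_freq)


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of sequences."""
    length = len(sequences[0])
    if len(sequences) < 2 or any(len(s) != length for s in sequences):
        raise ValueError("need at least two sequences of equal length")
    pairs = list(combinations(sequences, 2))
    differences = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return differences / (len(pairs) * length)


# Toy sample of four aligned sequences (hypothetical data).
sample = ["ACGT", "ACGT", "ACGA", "TCGA"]
h = haplotype_diversity(sample)
pi = nucleotide_diversity(sample)
print(f"h = {h:.4f}, pi = {pi:.4f}")
```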

The genealogy and distribution of haplotypes among populations were evaluated by generating a haplotype network for each marker in Haploviewer74, based on unrooted Maximum Likelihood (ML) trees.

The alignments of the different markers were processed, and the informative sites identified, in the Fasta2Structure software75, which converts this information into the input format for the Structure software. Structure 2.3.476 was used to infer the number of genetic demes in the samples, based on an admixture model with uncorrelated allele frequencies and without prior information on sample population membership, with a burn-in period of 500,000 and a run length of 1,000,000 iterations. One to four clusters (demes) were tested, and for each value of K, 10 independent simulations were run. The most likely K was defined by comparing the log-likelihood and Evanno’s ∆K using the software Structure Harvester77.
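Evanno’s ∆K is the absolute second-order rate of change of the log-likelihood across successive K values, divided by the standard deviation of the log-likelihood among replicate runs at that K. The sketch below computes it from the mean log-likelihood per K, which is one common reading of the method and may differ in detail from Structure Harvester; the function name and the log-likelihood values are hypothetical. Because ∆K needs both K - 1 and K + 1, it is undefined for the smallest and largest K tested, so a range of one to four yields values only for K = 2 and K = 3.

```python
from statistics import mean, stdev


def evanno_delta_k(ln_prob_by_k):
    """Return {K: delta K} from replicate log-likelihoods per K.

    `ln_prob_by_k` maps each K to a list of Ln P(D) values, one per
    replicate run (hypothetical input layout).
    """
    means = {k: mean(values) for k, values in ln_prob_by_k.items()}
    delta_k = {}
    for k in sorted(means):
        if k - 1 not in means or k + 1 not in means:
            continue  # undefined at the ends of the tested range
        spread = stdev(ln_prob_by_k[k])
        if spread == 0:
            continue  # undefined when replicates are identical
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta_k[k] = second_difference / spread
    return delta_k


# Hypothetical replicate log-likelihoods for K = 1 to 4 (three runs each).
ln_prob = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -802.0, -798.0],
    3: [-790.0, -795.0, -785.0],
    4: [-788.0, -780.0, -796.0],
}
result = evanno_delta_k(ln_prob)
best_k = max(result, key=result.get)
print(result, best_k)
```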

Migration patterns between groups and among populations within the same group were assessed with asymmetric rates, including all the genetic markers, using the isolation-with-migration approach implemented in the IMa 2.0 program78. Each run used 10 Metropolis-coupled MCMC (MCMCMC) chains with the linear heating mode. Multiple preliminary runs were conducted to assess prior adequacy and the mixing of the chains. Runs with different seeds were conducted to confirm the results, all under the same conditions: 50,000,000 steps and a 10% burn-in. Both convergence and ESS values (> 200) were checked to determine the quality of the runs.

Tests of selective neutrality, Tajima’s D79 and Fu’s Fs47, were run in ARLEQUIN to check for deviations from neutrality in all the markers (except Tmo-4C4) and to provide inferences on their demographic history.
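To make the first of these statistics concrete, the sketch below computes Tajima’s D from the number of segregating sites S and the mean number of pairwise differences, using the standard constants a1, a2, b1, b2, c1, c2, e1 and e2. It is a minimal sketch for gap-free alignments, not the ARLEQUIN implementation, and it does not compute the significance of D; the function name and the toy sample are hypothetical. At least four sequences are required here because the variance term vanishes for smaller samples.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Return Tajima's D for aligned, gap-free sequences (None if S = 0)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites in the sample.
    seg_sites = sum(len({s[i] for s in sequences}) > 1 for i in range(length))
    if seg_sites == 0:
        return None  # D is undefined without polymorphism

    # Mean number of pairwise nucleotide differences.
    pairs = list(combinations(sequences, 2))
    k_hat = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    var_d = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k_hat - seg_sites / a1) / sqrt(var_d)


# Toy sample of four aligned sequences (hypothetical data).
d_value = tajimas_d(["ACGT", "ACGT", "ACGA", "TCGA"])
print(d_value)
```

Negative values of D reflect an excess of rare variants, which is commonly interpreted as a signal of population expansion or purifying selection, whereas positive values reflect an excess of intermediate-frequency variants.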

The demographic history of the L. breviceps populations was evaluated using five markers with the Extended Bayesian Skyline Plot (EBSP) method80, implemented in BEAST66. We used the GTR + I + G model for the partitions, as suggested by the PartitionFinder 2.1 software. A relaxed clock with an uncorrelated lognormal distribution was initially assumed; the markers’ ucld.stdev values were close to zero, indicating that the hypothesis of a uniform substitution rate could not be rejected. Consequently, a strict clock with a substitution rate of 6.2% per million years was used, considering a minimum rate of 3.6% and a maximum rate of 8.8% for the control region81,82. After the analysis was completed, ESS values greater than 200 were verified using TRACER 1.7.164, and the graph was generated using a Python script provided by Heled83.

Morphometrics analysis

All measurements were taken from the left side of the body, following Parsons et al.84, who define 20 truss length measurements describing body shape and the proportions of structures such as the head and abdomen (including the caudal peduncle) (Fig. 8). To reduce error due to the positioning of the specimens on the image recording screen, the specimens were carefully measured with a caliper and each measurement was then confirmed using images. A total of 160 individuals used in the molecular analyses, representing all populations (except Espírito Santo), were measured, irrespective of sex, and the values were recorded in spreadsheets for subsequent statistical treatment.

Fig. 8

Adapted from Parsons et al. 84.

The 20 truss lengths measured on each specimen for the morphometric analysis. (A) body measurements; (B) head measurements. 1–2: length of the snout to the origin of the dorsal fin, 1–3: length of snout to the origin of the pelvic fin, 1–11: standard length, 2–3: body depth, 2–9: dorsal base length, 4–5: length of pectoral fin, 3–7: length from the origin of the pelvic fin to the posterior margin of the anal fin, 6–9: length from the origin of the anal fin to the posterior of the dorsal fin, 2–7: length from the anterior margin of the dorsal fin to the posterior margin of the anal fin, 9–7: depth of the anterior caudal peduncle, 8–10: depth of the posterior caudal peduncle, 9–10: length of the dorsal caudal peduncle, 7–8: length of the ventral caudal peduncle, 9–8: distance from the dorsal anterior to the ventral posterior caudal peduncle, 7–10: distance between the ventral anterior and dorsal posterior caudal peduncle, 1–12: snout length, 12–13: eye diameter, 1–17: head length, 14–15: cheek depth, 15–16: lower jaw length.

The measurements were analyzed individually with a one-way ANOVA (α = 5%), with the genetically defined L. breviceps lineages I and II as the categorical variable; the residuals were validated using tests of normality (Shapiro-Wilk) and homoscedasticity (Levene). A square matrix based on the squared Bray-Curtis distance between the numerical variables was constructed, to which multidimensional scaling (MDS) and principal components analysis (PCA) were applied. The MDS method arranges the measurements spatially according to the rank order of their distances, which allows the similarity between the tested categories to be assessed85,86. The stress value was used as a measure of how well the ordination represents the groups, with values < 0.20 considered acceptable87. Likewise, PCA is a factorial model in which the factors are based on variance, which is explained by vectors arranged along two or more axes. For this analysis, the first two axes were used, and the percentage of variance they explain is shown in tables. Before the calculations, the data were normalized (Z = (x - µ) / σ) to reduce stress. All the statistical analyses in this study were carried out using the PAST software88.