Introduction

The freshwater ichthyofauna of the north Mediterranean region, which represents an important endemic diversity when compared with that of northern Europe, is in urgent need of conservation assessment and action, with little information available on distribution and conservation status of many species (Crivelli & Maitland, 1995a,b). Cyprinids in general, and particularly species of the genus Chondrostoma, while once very abundant in European rivers, have in recent times reached endangered status. In comparison to the salmonids (Laikre, 1999) there have been relatively few population genetic diversity studies of cyprinids with the purpose of defining units (evolutionary divergent populations) for conservation strategies.

The Portuguese nase, Chondrostoma lusitanicum Collares-Pereira 1980, a small cyprinid fish that inhabits shallow streams of medium flow with vegetation on the banks, is endemic to southern Portugal and is listed as rare in the Portuguese Vertebrate Red Data Book (SNPRCN, 1991). The species has a highly restricted distribution (Fig. 1), being found in the Tejo (only the western tributaries) and Sado river basins, where it has a particularly discontinuous distribution and small population size, in the small Samarra basin near Tejo, and in the small southern Mira and Arade basins, where this fish is somewhat more abundant (Collares-Pereira, 1983; Nelva et al., 1988; Alves & Coelho, 1994; Coelho et al., 1997). Several authors (Vrijenhoek et al., 1985; Meffe, 1986, 1987, 1990; Allendorf & Leary, 1988; Meffe & Vrijenhoek, 1988) have highlighted the genetic consequences (through genetic drift, bottlenecks and inbreeding) of such small and isolated local populations, resulting in low genetic variability within, and high genetic divergence between populations, and have stressed the importance of population genetic data for the management of endangered species.

Fig. 1 River basins in southern Portugal from which fish were collected. The shaded area corresponds to the known distribution of Chondrostoma lusitanicum.

Several small studies using allozyme variation in C. lusitanicum have shown a high degree of population subdivision within the Tejo basin and a lack of gene flow between local populations (Alves & Coelho, 1994), and suggested low levels of intra-population variation and high levels of between-basin differentiation (particularly concerning the southern basins) (Coelho et al., 1997). One factor suggested to contribute to levels of genetic divergence observed between the isolated southern populations and other populations is bottlenecks due to high mortality resulting from the periodic drying-up of extensive sections of these small southern rivers during summer (Coelho et al., 1997). In relation to the individual biogeographical nature of the southern region already established for other coexisting cyprinid species of the genus Leuciscus (Coelho et al., 1995, 1998), the genetic divergence of the southern populations of C. lusitanicum may merit re-classification of their specific status. The recognition of special status for genetically divergent populations of C. lusitanicum as different species or separate Evolutionary Significant Units (ESUs) can be expected to have profound implications for the protection and management of this endangered species.

The aim of the present study was to use mitochondrial DNA (mtDNA) variation in this species to provide a more accurate determination of genetic divergence and of phylogenetic relationships, and to extend and complement the existing allozyme data, in order to facilitate development of a rational programme of conservation. The advantages of mtDNA over allozyme analyses for biogeographical comparisons are well known (Wilson et al., 1985; Avise et al., 1987; Moritz et al., 1987; Moritz, 1994a). Population characteristics of Portuguese nase such as fragmentation and isolation, plus unbalanced sex ratios (2 males : 1 female) observed amongst adults in the Mira basin (Magalhães & Collares-Pereira, unpublished data), may render them especially susceptible to the genetic drift, bottleneck and inbreeding processes to which the mtDNA is particularly sensitive.

In this study we examined population structure and divergence in C. lusitanicum with samples from most of the species’ geographical range, using direct sequencing of the cytochrome b (cyt b) gene and restriction fragment length polymorphism (RFLP) analysis of the region coding for NADH subunits 5 and 6 (ND-5/6). Data from the two methods (sequencing and RFLP) are compared to assess whether the simpler RFLP method is adequate for describing phylogeographic patterns in such a freshwater fish. The results are also compared with the levels of genetic variability and divergence obtained in previous allozyme analyses, with particular emphasis on the high levels of genetic divergence detected for the southern (Mira and Arade) populations. The results are discussed, in relation to the conservation of this highly fragmented species, in terms of Evolutionary Significant Units and Management Units.

Materials and methods

Sample collection and DNA extraction

Samples of C. lusitanicum were collected from five different basins (Fig. 1): Tejo (N=2), Samarra (N=10), Sado (N=10), Mira (N=10) and Arade (N=10). The specimens were transported in liquid nitrogen and stored at −80°C.

Total DNA was extracted from fins and muscle tissue following standard protocols of incubation with SDS and proteinase K, followed by phenol–chloroform extraction (Sambrook et al., 1989).

Amplification and sequencing analysis of the cyt b gene

Amplification and sequencing of the cyt b gene were undertaken for only a subset of the individuals sampled: 2 from Tejo, 5 from Samarra, 4 from Sado, 4 from Mira and 4 from Arade.

The mitochondrial cyt b gene was amplified using primers LCB1 [5′-AAT gAC TTg AAg AAC CAC CgT-3′] (Brito et al., 1997) and HA [5′-CAA CgA TCT CCg gTT TAC AAg AC-3′] (Schmidt & Gold, 1993). PCR amplifications were performed under the following conditions: 30 cycles of 60 s at 94°C, 30 s at 50°C and 60 s at 72°C, using a Perkin Elmer 2400 thermal cycler. Reaction mixes contained 100 ng template DNA, 2.5 mM MgCl2, 0.2 mM each nucleotide, 0.2 μM of each primer, 1 U Taq polymerase (Gibco BRL) with the manufacturer’s supplied 1× KCl buffer, in a final reaction volume of 50 μL. Double-stranded amplification products were purified with QIAquick PCR Purification Kit (Qiagen) and cycle sequenced (Amersham Thermosequenase) in both directions with the same primers (5′ end-labelled with a Cy5 fluorescent dye group), run out on an ALFexpress automated sequencer (Pharmacia Biotech).

Amplification and RFLP analysis of the ND-5/6 gene region

RFLP analysis of the ND-5/6 region, comprising the genes coding for the NADH subunits 5 and 6, was undertaken for all individuals. The ND-5/6 region was amplified using primers ND56-L (5′-AAT AgT TTA TCC gTT ggT CTT Agg-3′ — Cronin et al., 1993), and ND56-H (5′-gTT gAA TgA CAA Tgg Tgg TTC TTC-3′ — Toline & Baker, 1995), which amplify a product of approximately 2.5 kb. To each 25 μL amplification reaction was added 25–50 ng of template DNA, 0.25 μL of each primer (20 μM) and 22.5 μL of PCR Super Mix (1.1×; Gibco BRL, Life Technologies), with the following composition: Tris-HCl 22 mM (pH 8.4), KCl 55 mM, MgCl2 1.65 mM, 220 μM dNTPs, 22 U of Taq DNA Polymerase/mL and stabilisers (as in the supplier protocol). The amplification reactions, conducted in a GeneAmp PCR System 2400 (Perkin Elmer) thermal-cycler, consisted of 32 cycles of 95°C for 35 s, 50°C for 30 s and 72°C for 120 s.

The ND-5/6 amplification products were digested with 13 endonucleases, recognizing sequences of four, five and six nucleotides: BstNI, BstUI, DdeI, HaeIII, HhaI, HindIII, HinfI, HpaII, MboI, NciI, RsaI, StyI and TaqI. Two further enzymes, BamHI and EcoRI, did not cut the amplified product, and one, AluI, produced inconsistent restriction patterns. Each restriction reaction contained 150–300 ng of amplification product, digested according to individual enzyme suppliers’ directions. Restriction fragments were separated on a 2% agarose gel, stained with ethidium bromide, and the size of fragments estimated relative to a standard 100 bp DNA Ladder (Pharmacia Biotech).

Statistical analysis

Cyt b sequences were aligned using CLUSTAL X v.1.5b (Thompson et al., 1994). Estimates of sequence divergence (distance) between individuals and samples were calculated according to Kimura’s (1980) two-parameter (K-2) method.
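As an illustration of the distance measure named above, the following is a minimal Python sketch of Kimura's (1980) two-parameter distance for a pair of aligned sequences. The function name `k2p_distance` and the toy sequences are assumptions made for this example; the sketch is not the routine used to produce the estimates reported here, and it simply skips gaps and ambiguous bases.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of sites showing transitions and transversions.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are saturated
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical 10 bp alignment with a single A<->G transition
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))  # 0.1116
```

With one transition in ten sites the uncorrected difference is 0.10, and the correction for multiple hits raises the estimate slightly.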

Within-sample variation, using haplotype (Nei, 1987) and nucleotide (Nei & Tajima, 1981; Nei, 1987) diversity, and between-sample variation, using nucleotide divergence (Nei & Tajima, 1981; Nei, 1987), were calculated for both cyt b sequence and ND-5/6 RFLP data using REAP v.4.0 (McElroy et al., 1992). RFLP analysis was done with and without the inclusion of a dummy shared character to avoid overestimation of the evolutionary distance between haplotypes not sharing characters at particular loci, as pointed out by McElroy et al. (1992); estimates obtained were essentially the same, so we present here only the results without the inclusion of the false character. Geographical heterogeneity in ND-5/6 RFLP composite haplotype frequencies among basins was tested using a Monte Carlo randomization, available in REAP.
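The two within-sample measures can be sketched in a few lines of Python. This is a simplified illustration and not the REAP implementation: haplotype diversity follows Nei's (1987) unbiased estimator h = n/(n − 1) × (1 − Σx²), while nucleotide diversity is taken here as the mean uncorrected proportion of differing sites over all pairs of individuals. The function names and the example data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's (1987) unbiased haplotype diversity for one sample."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of individuals."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")

    def p_distance(a, b):
        return sum(x != y for x, y in zip(a, b)) / len(a)

    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical sample of ten fish carrying three composite haplotypes
sample = ["h1"] * 4 + ["h2"] * 2 + ["h3"] * 4
print(round(haplotype_diversity(sample), 3))  # 0.711

# A sample fixed for a single haplotype has zero haplotype diversity
print(haplotype_diversity(["h1"] * 10))  # 0.0
```

A sample fixed for one haplotype therefore contributes zero to both measures, whatever its size.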

Patterns of genetic divergence between geographical samples, using pairwise nucleotide divergence values for both cyt b and ND-5/6 datasets, and between sequence/RFLP haplotypes, using K-2 distances (cyt b) or evolutionary divergence d (ND-5/6), were displayed by neighbour-joining trees produced using the NEIGHBOUR algorithm in PHYLIP v.3.5 (Felsenstein, 1993). Phylogenetic trees of cyt b sequences were generated by both maximum-parsimony and neighbour-joining methods in PAUP v.4.0d (Swofford, 1998). Published cyt b sequences for Chondrostoma polylepis Steindachner 1865 (EMBL Acc. No. Z75108; Brito et al., 1997) and Chondrostoma arcasi (Steindachner 1866) (EMBL Acc. No. X99424; Alves et al., 1997) were used as outgroups in rooting trees. Maximum-parsimony analysis used heuristic search, random stepwise addition and tree bisection–reconnection methods. For the neighbour-joining distance trees, sequence divergence was calculated according to Kimura’s (1980) two-parameter and Jukes & Cantor’s (1969) methods.

In order to test for congruence in genetic distances among samples estimated from the two mtDNA methods, a Mantel test, with 10 000 random permutations, was carried out in GENEPOP v.3.1b (Raymond & Roussel, 1995), using pairwise values of K-2 distance (cyt b) and evolutionary divergence d (ND-5/6) as the input matrices.
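The logic of a Mantel test can be sketched as follows. This is a generic permutation version written for illustration, not the GENEPOP routine: the correlation between the off-diagonal elements of two distance matrices is compared with the values obtained when the rows and columns of one matrix are jointly relabelled at random. The function names, the seed and the two 5 × 5 matrices are hypothetical.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel_test(m1, m2, permutations=10000, seed=0):
    """Return (observed r, one-tailed P) for two square distance matrices."""
    n = len(m1)
    pairs = list(combinations(range(n), 2))
    x = [m1[i][j] for i, j in pairs]
    observed = pearson(x, [m2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel the samples of the second matrix
        y = [m2[order[i]][order[j]] for i, j in pairs]
        if pearson(x, y) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical distance matrices for five samples (values are illustrative)
cytb = [[0.0, 0.4, 2.0, 5.5, 5.9],
        [0.4, 0.0, 2.1, 5.4, 5.8],
        [2.0, 2.1, 0.0, 5.6, 5.7],
        [5.5, 5.4, 5.6, 0.0, 0.1],
        [5.9, 5.8, 5.7, 0.1, 0.0]]
rflp = [[0.0, 0.3, 2.9, 7.4, 8.6],
        [0.3, 0.0, 3.0, 7.5, 8.5],
        [2.9, 3.0, 0.0, 7.9, 8.1],
        [7.4, 7.5, 7.9, 0.0, 0.2],
        [8.6, 8.5, 8.1, 0.2, 0.0]]
r, p = mantel_test(cytb, rflp, permutations=1000)
```

Permuting sample labels, rather than individual matrix cells, preserves the dependence among distances that involve the same sample.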

A hierarchical analysis of the geographical partitioning of genetic variation (variance components and Φ-statistics) within the data set was performed within WINAMOVA v.1.55 (Excoffier et al., 1992), using K-2 distance for cyt b and evolutionary divergence d (Nei & Li, 1979; Nei & Tajima, 1983; Nei, 1987) for ND-5/6. This analysis gives the correlation of haplotypes at different levels of the hierarchical subdivision: ΦST indicates the correlation level among randomly chosen haplotypes from one basin, in comparison to the correlation level of pairs of randomly chosen haplotypes from all the sampled basins; ΦCT indicates the correlation level among randomly chosen haplotypes from one geographical group, in comparison to the correlation level of pairs of randomly chosen haplotypes from all geographical groups; ΦSC indicates the correlation level among randomly chosen haplotypes from one basin, in comparison to the correlation level of pairs of randomly chosen haplotypes from that geographical group (Excoffier et al., 1992). Significance of variance components and Φ-statistics was tested against the null distribution generated by 1000 random permutations.

Results

A 985 bp region of cyt b was successfully sequenced in almost all samples. Nucleotide composition of this region in C. lusitanicum was skewed, with a deficit of guanine (16.4%) compared to approximately equal frequencies of adenine (26.4%), cytosine (27.9%) and thymine (29.4%). Seventy-eight (7.92%) base positions were variable (Table 1), 66 (6.70%) being parsimony-informative, and representing 66 transitions and 14 transversions (two sites show both transition and transversion changes), with no deletions or insertions. Most substitutions (59; 75.6%) were in third codon positions, with only 15 (19.2%) in first and four (5.1%) in second codon positions, resulting in 14 (4.3%) amino acid changes.

Table 1 Chondrostoma lusitanicum cyt b sequence haplotypes and variable sites in the 985 bp region sequenced. Parsimony-informative characters are given in bold and dots indicate equality with sequence 1. Haplotype frequencies within the five areas sampled (T, Tejo; R, Samarra; S, Sado; A, Arade; M, Mira) are shown

Within- and between-sample variation

Sixteen cyt b sequence haplotypes, differing by 1–59 substitutions, were observed among the 19 individuals screened (Table 1), with no haplotypes shared between any pair of basins. Pairwise sequence divergence varied from 0.10% to 6.34%, with the smallest values found between specimens from the same basin (0.10%–0.71%), between specimens from the Mira and the Arade basins (0.32%–0.61%) and between specimens from the Tejo and the Samarra basins (0.41%–0.71%). The largest differences were found between the specimens from the Mira and the Arade basins and all the others (5.28%–6.34%). Fish from Sado showed less sequence divergence from the Tejo and Samarra specimens (1.65%–2.38%) than from the Mira and Arade specimens (5.39%–6.11%). Comparisons with the outgroups revealed uniformly high levels of sequence divergence: 9.13%–10.10% against C. polylepis; and 11.06%–12.35% against C. arcasi. Within-basin haplotypic diversity was very high (Table 2), ranging from 0.667 (±0.2041) in Tejo to 0.889 (±0.0596) in Samarra (average=0.825 ± 0.0016), with only two of the 16 haplotypes being shared by more than one individual (Table 1). Within-basin nucleotide diversity, however, was very low (Table 2), ranging from 0.001 in Tejo to 0.004 in Mira (average=0.002 ± 0.0000), reflecting the high similarity of sequences within basins (Table 1). High haplotype diversity with low nucleotide diversity within samples may suggest the action of nonequilibrium evolutionary or non-neutral forces such as ‘founder-flush’ population demographics or the persistence of slightly deleterious mutations in small populations due to ineffectiveness of ‘cleansing’ selection. To test this we calculated Tajima’s (1989) D, and tested it for significant departure from zero, using ARLEQUIN v.2.0 (beta 2) (Schneider et al., 1999). All samples except Tejo exhibited substantial, but not significant, negative values of D (Tejo: D=0.000, P=1.00, n=2; Samarra: D=−1.048, P > 0.10, n=5; Sado: D=−0.754, P > 0.10, n=4; Mira: D=−0.389, P > 0.10, n=4; Arade: D=−0.212, P > 0.10, n=4).
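For reference, Tajima's (1989) D can be computed from three summary values: the number of sequences n, the number of segregating sites S, and the mean number of pairwise differences. The Python sketch below follows the standard formula and is not the ARLEQUIN code. The function name, the tolerance and the convention of returning 0 when the variance term vanishes are assumptions of this example.

```python
import math


def tajimas_d(n, segregating_sites, mean_pairwise_diff):
    """Tajima's (1989) D from n sequences, S segregating sites and
    the mean number of pairwise differences."""
    if n < 2:
        raise ValueError("at least two sequences are required")
    s = segregating_sites
    if s == 0:
        return 0.0  # no variation: D is set to 0 here by convention
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance < 1e-12:
        return 0.0  # variance vanishes for n = 2 and n = 3 (assumed convention)
    return (mean_pairwise_diff - s / a1) / math.sqrt(variance)


# Hypothetical sample: five sequences, four segregating sites, all singletons,
# so the mean number of pairwise differences is 4 * 2 * 1 * 4 / (5 * 4) = 1.6
d_singletons = tajimas_d(5, 4, 1.6)

# With only two sequences the statistic carries no information
d_pair = tajimas_d(2, 3, 3.0)
```

In the hypothetical five-sequence case every variable site is a singleton, which gives a negative D (an excess of rare variants). For n=2 both the numerator and the variance are zero, so a sample of two sequences, such as that from the Tejo, cannot depart from zero.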

Table 2 MtDNA nucleotide divergence (Nei & Tajima, 1981; Nei, 1987) between samples of Chondrostoma lusitanicum, for cyt b sequence data (above diagonal) and ND-5/6 RFLP data (below diagonal), and haplotype diversity (Nei, 1987) and nucleotide diversity (Nei & Tajima, 1981; Nei, 1987) within samples for cyt b sequence data and ND-5/6 RFLP data

Restriction of the 2.5 kb ND-5/6 product with 13 endonucleases revealed a total of 107 restriction fragments, comprising 11 composite haplotypes, from the 42 individuals sampled (Table 3). In general, the haplotypes exhibited a distribution restricted to only one basin, with only haplotype 6, the most common haplotype in the Mira and Arade basins, being found in more than one basin (Table 3). Average within-basin haplotype diversity was 0.257 (±0.0272) and the average nucleotide diversity was 0.001 (±0.0000) (Table 2). Such low values result principally from the Tejo, Samarra and Arade samples showing only a single haplotype (Tables 2 and 3). The Sado and Mira samples possessed higher levels of variation, with more than one haplotype present, but still with one haplotype representing between 40% (Mira) and 70% (Sado) of individuals (Tables 2 and 3).

Table 3 Chondrostoma lusitanicum ND-5/6 RFLP composite haplotypes, restriction enzyme phenotypes per endonuclease, and haplotype distribution frequencies across the five basins sampled. The restriction enzymes used were: 1, BstNI; 2, BstUI; 3 DdeI; 4, HaeIII; 5, HhaI; 6, HindIII; 7, HinfI; 8, HpaII; 9, MboI; 10, NciI; 11, RsaI; 12, StyI; 13, TaqI. The basins sampled were: T, Tejo; R, Samarra; S, Sado; A, Arade; M, Mira

Values of nucleotide divergence between samples (basins) for the cyt b sequence data ranged from 0.09% to 5.90% (average=3.78%), and for ND-5/6 RFLP data ranged from 0.10% to 8.62% (average=5.48%) (Table 2). For both analyses, the lowest values of nucleotide divergence were found between the Mira and Arade (cyt b=0.09%; ND-5/6=0.10%) and between the Tejo and Samarra (cyt b=0.41%; ND-5/6=0.36%) samples, with the highest values found in comparisons between Mira or Arade and the other basins (cyt b=5.38%–5.90%; ND-5/6=7.34%–8.62%) (Table 2). Tests for heterogeneity in haplotype frequency distributions among samples gave highly significant values for both the cyt b (χ2=76.00, P < 0.001) and ND-5/6 (χ2=144.00, P < 0.001) datasets, with the χ2-values exceeding all 1000 values in Monte Carlo simulations.
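A Monte Carlo test of this kind can be sketched as below. It is a generic randomization written for illustration, not the REAP program itself: the χ2 statistic of the haplotype-by-basin table is compared with the values obtained after basin labels are shuffled among individuals. The function names, the seed and the example data are hypothetical.

```python
import random
from collections import Counter


def chi_square(haplotypes, basins):
    """Pearson chi-square of the haplotype-by-basin contingency table."""
    n = len(haplotypes)
    hap_totals = Counter(haplotypes)
    basin_totals = Counter(basins)
    observed = Counter(zip(haplotypes, basins))
    stat = 0.0
    for hap, hap_n in hap_totals.items():
        for basin, basin_n in basin_totals.items():
            expected = hap_n * basin_n / n
            stat += (observed[(hap, basin)] - expected) ** 2 / expected
    return stat


def monte_carlo_heterogeneity(haplotypes, basins, replicates=1000, seed=0):
    """Return (observed chi-square, proportion of shuffles >= observed)."""
    observed = chi_square(haplotypes, basins)
    rng = random.Random(seed)
    shuffled = list(basins)
    exceed = 0
    for _ in range(replicates):
        rng.shuffle(shuffled)  # break any haplotype-basin association
        if chi_square(haplotypes, shuffled) >= observed:
            exceed += 1
    return observed, exceed / replicates


# Hypothetical data: two basins, each fixed for its own haplotype
haps = ["a"] * 5 + ["b"] * 5
locs = ["X"] * 5 + ["Y"] * 5
stat, p_value = monte_carlo_heterogeneity(haps, locs)
print(stat)  # 10.0
```

When no haplotype is shared between basins, the statistic takes its maximum value of N × (number of basins − 1). For 19 individuals from five basins this is 19 × 4 = 76, which matches the cyt b value reported above.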

Neighbour-joining clustering of nucleotide divergence (both cyt b and ND-5/6) between samples, and of K-2 (cyt b) or evolutionary divergence d (ND-5/6) distances between individual haplotypes, all presented essentially the same topology (Fig. 2). All four trees in Fig. 2 show close clustering of Tejo with Samarra and Mira with Arade, with Sado clustering with Tejo/Samarra before all three join Mira/Arade. A Mantel test comparing pairwise estimates of genetic distance between samples estimated from the two different datasets, using K-2 (cyt b) distances and d (ND-5/6), demonstrated that they were significantly correlated (P ≤ 0.001).

Fig. 2 Chondrostoma lusitanicum: unrooted neighbour-joining trees of nucleotide divergence (Nei & Tajima, 1981; Nei, 1987) between samples for cyt b sequence data (a) and ND-5/6 RFLP data (b), and of Kimura 2-parameter distances between cyt b sequence haplotypes (c) and evolutionary divergence d (Nei & Li, 1979; Nei & Tajima, 1983; Nei, 1987) between ND-5/6 RFLP composite haplotypes (d). Cyt b sequence haplotypes: 1–16. ND-5/6 RFLP haplotypes: 1–8, 11.

Phylogenetic relationships

The phylogenetic trees resulting from maximum-parsimony, and from neighbour-joining of Kimura 2-parameter and Jukes–Cantor distances, revealed essentially the same topology, differing only slightly at poorly resolved nodes; the NJ tree of K-2 distances is presented in Fig. 3. Sequence haplotypes cluster consistently with all other individuals from their own basin and separately from other basins, except for mixing of the Mira/Arade sequences. The Samarra group is consistently supported by bootstrap values above 79%, based on 4 base positions (not shown in Fig. 3). Three distinct geographical groups are apparent, with high bootstrap support (>98%) on nodes: Tejo/Samarra, Sado, and Mira/Arade. The Tejo/Samarra–Sado vs. Mira/Arade division is also very clear, based on differences at 28 base positions and 100% bootstrap support.

Fig. 3 Neighbour-joining tree of Chondrostoma lusitanicum cyt b sequences, using Kimura’s (1980) 2-parameter distance. Number of substitutions (above) and percentage of 1000 bootstrapped replicates (below, in bold) that support each branch of the tree, presented only for branches representing geographical regions. C. polylepis and C. arcasi were used as outgroups to root the tree.

Partition of the mtDNA variation

The neighbour-joining analyses of genetic divergence among samples and haplotypes (Figs. 2 and 3) and the high bootstrap support (Fig. 3) suggested a hierarchical analysis of the genetic variation with the definition of three (Tejo/Samarra — Sado — Mira/Arade) distinct geographical divisions. Analysis of molecular variance (AMOVA), made in accordance with these geographical groupings gave very similar results for the cyt b sequence data and the ND-5/6 RFLP data. Levels of interbasin genetic divergence were very high (cyt b ΦST=0.955; ND-5/6 ΦST=0.975), but so were levels between geographical groupings (cyt b ΦCT=0.909; ND-5/6 ΦCT=0.953). The apportioning of total variance illustrates very well the geographical patterns achieved: very little variation within samples (cyt b=4.51%; ND-5/6=2.55%) and between basins within geographical groupings (cyt b=4.58%; ND-5/6=2.18%), and the majority of variance between groups (cyt b= 90.91%; ND-5/6=95.27%). Estimates of variation within samples and groups were almost double for the sequence analysis compared to the RFLP analysis. Permutation testing of Φ-values showed that the partitioning of variance at all levels of the hierarchy was highly significant (P < 0.001).
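The reported Φ-values can be recovered from the variance percentages above using the standard relationships of Excoffier et al. (1992): ΦCT is the among-groups share of total variance, ΦST is the share found among groups plus among basins within groups, and ΦSC is the among-basins share of the variance within groups. The short Python check below is an illustrative sketch with a hypothetical function name. ΦSC is not reported in the text and is shown only as a derived value.

```python
def phi_statistics(among_groups, among_basins_within_groups, within_basins):
    """Phi-statistics from the three AMOVA variance components.

    Components may be given as raw variances or as percentages,
    because only their ratios matter.
    """
    total = among_groups + among_basins_within_groups + within_basins
    return {
        "phi_ct": among_groups / total,
        "phi_st": (among_groups + among_basins_within_groups) / total,
        "phi_sc": among_basins_within_groups
        / (among_basins_within_groups + within_basins),
    }


# Percentages of total variance reported for the three-group analysis
cyt_b = phi_statistics(90.91, 4.58, 4.51)
nd56 = phi_statistics(95.27, 2.18, 2.55)

for name, phi in (("cyt b", cyt_b), ("ND-5/6", nd56)):
    print(name, {key: round(value, 3) for key, value in phi.items()})
```

For cyt b this gives ΦCT ≈ 0.909 and ΦST ≈ 0.955, in agreement with the values above. The ND-5/6 values agree to within rounding of the published percentages.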

Pairwise values of ΦST among samples (values not shown) were high and highly significant (of the same order as the overall ΦST), with the exception of the two southern basins, Mira and Arade. ΦST values were still high between these two basins (cyt b=0.213; ND-5/6= 0.325), and significant for the RFLP data (P=0.010), but not for the sequence data (P=0.107).

Discussion

Although the data of the current study are not extensive, in particular for the Tejo basin due to extreme difficulties in locating this cyprinid in recent surveys, they give clear, unambiguous indications of levels of phylogeographical structuring within populations of Chondrostoma lusitanicum in Portugal. Data on levels of genetic variation within populations are informative, but estimates of number of genotypes should be viewed with caution in view of the limited sample sizes.

The present results on mtDNA variation within C. lusitanicum indicate low values of genetic diversity within basins. The cytochrome b sequence analysis, despite uncovering relatively high values of haplotype diversity (16 different haplotypes in 19 individuals), indicated very low values of within-basin nucleotide diversity (i.e. all individuals are very similar genetically). Negative values of Tajima’s (1989) D within almost all samples may indicate that populations have been subject to recent bottlenecks with subsequent expansion, or that population sizes have been consistently low for some time. Given the information known about population demographics in this species [seasonal high mortality in southern rivers (Coelho et al., 1997), fragmented population structure in the Tejo river (Alves & Coelho, 1994)], it would seem that repeated bottlenecks with subsequent recovery of populations is the most likely explanation. The ND-5/6 RFLP data similarly indicated very low values of nucleotide diversity, but also of haplotype diversity within basins. It should be noted, however, that sample sizes used were not ideal for estimating within-sample variation, so values presented should only be viewed as indicating relative levels of variation. In contrast to the within-basin picture, there was considerable genetic divergence among basins, with fixation of different ND-5/6 RFLP haplotypes and distinct cyt b sequence divergence between, as opposed to within, almost all basins, resulting in relatively high average values of nucleotide divergence and high ΦST values. A hierarchical AMOVA indicated that almost the entire mtDNA variation detected (91%–95%) was due to among-groups variance, with only a small part due to within-population variation. Low levels of variability and high levels of between-basin differentiation are congruent with suggestions from previous allozyme studies of C. lusitanicum (Alves & Coelho, 1994; Coelho et al., 1997), and fit the general trend in European cyprinids (Hänfling & Brandl, 1998a): low genetic variability within and high genetic divergence between populations for species with restricted distribution; high genetic variability within and low genetic divergence between populations for widespread species.

Not surprisingly, the RFLP analysis underestimated haplotype diversity compared to the direct sequence analysis, and therefore also underestimated within-sample nucleotide diversity. The RFLP analysis did, however, produce results comparable to those of the sequence analysis in terms of between-sample diversity and patterns of phylogeographic structuring in C. lusitanicum (Fig. 2), and its divergence estimates between samples were strongly correlated with those from the cyt b analysis. It would appear therefore that a simple RFLP analysis might be an acceptable option for determining relative population structuring in freshwater fish such as C. lusitanicum, as opposed to the contradictory results sometimes produced by such analyses in marine fish (e.g. Carr & Marshall, 1991).

Particularly strong genetic divergence of the southern basins, Mira and Arade, was apparent, with high values of nucleotide divergence and pairwise sequence divergence when compared with the other samples. The level of genetic divergence observed between the Mira/Arade and the more northern basins (cyt b sequence divergence of 5.3–6.3%) could suggest a distinct taxonomic status for the resident southern populations, possibly even to species level, as suggested by previous allozyme studies [Cavalli-Sforza & Edwards (1967) chord distance=0.278–0.428, and the presence of two fixed allele differences at the PGDH locus; Coelho et al., 1997]. Such a proposal is also supported by several meristic characters, such as number of scales in the lateral line and number of gill rakers, which are higher in fish from the southern basins than on average for the Portuguese nase (Collares-Pereira, 1983). More accurate morphological and osteological studies will be needed for the confirmation and definition of a new species. A similar situation is seen in another Iberian cyprinid, Leuciscus pyrenaicus, for which two new species have recently been described (Coelho et al., 1998) from the same basins: L. torgalensis in Mira; and L. aradensis in Arade. These two species are morphologically very similar to each other and to L. pyrenaicus, but show high levels of genetic divergence at allozyme loci [Cavalli-Sforza & Edwards (1967) chord distance=0.396–0.563; Coelho et al., 1995] and levels of cyt b sequence divergence within those observed for distinct species of freshwater fishes (5.1–10.7%, K-2 distance; Brito et al., 1997). The lack of divergence between the Arade and Mira populations of C. lusitanicum may be due to river capture (and therefore gene flow) occurring between the two drainages: C. lusitanicum occurs in the headstreams where capture takes place whereas Leuciscus does not (unpublished data and Magalhães, personal communication).

The pattern of genetic divergence of C. lusitanicum populations may be related to geological events: the Tejo and Sado drainages are thought to have been connected until the Pleistocene (Azevedo & Cabral, 1986), whereas the formation of Serra-do-Caldeirão, the mountains enclosing the Mira and Arade basins, commenced during the early Pliocene (Feio, 1952). Besides such geological events, the population structure of C. lusitanicum, and in particular the high differentiation of the Mira and Arade populations, has been interpreted in the light of recent or even present bottlenecks in population sizes (Coelho et al., 1997). The ecological conditions related to the particular hydrological regimes of these intermittent rivers periodically result in high fish mortality. Under such conditions, mtDNA would be expected to show even lower levels of genetic variability in such populations, especially considering the further predicted reduction in effective population size by unbalanced sex ratios (2 males : 1 female). Nevertheless the Mira population possessed the highest levels of nucleotide diversity (although still extremely low compared to between-sample diversity). As suggested above to explain the negative Tajima D-values, this may be because, despite high mortality in the summer, C. lusitanicum is one of the two most common cyprinid species in the Mira basin (Beja, 1995), with a continuous distribution and a population size that is most likely considerably higher than those of the other basins. Rapid recovery of population size after summer bottlenecks reduces their effect on haplotype diversity and promotes the production of groups of closely related genotypes.

Other examples of genetic studies in freshwater fishes, such as those of Poecilia reticulata (Carvalho et al., 1991; Shaw et al., 1991, 1994) and Cottus gobio (Hänfling & Brandl, 1998b, c), using allozymes, have demonstrated similarly low levels of intrapopulation genetic diversity combined with high genetic divergence among populations. Data on the mtDNA ND-5/6 genes of P. reticulata also indicated extremely low levels of intrapopulation diversity, with different alleles or haplotypes fixed in different drainages, that are thought to result from the stochastic effects of local extinctions and founder events (Carvalho et al., 1996; Shaw et al., unpublished data).

The nucleotide divergence clustering analyses for both cyt b sequencing and ND-5/6 RFLP data (Fig. 2), and the cyt b phylogenetic analysis (Fig. 3), all indicated pronounced phylogeographic structuring of C. lusitanicum, with the definition of one (Tejo/Samarra/Sado vs. Mira/Arade) or two (Tejo/Samarra vs. Sado vs. Mira/Arade) major genetic divisions. AMOVA indicated that with only one major division (two groups), between-groups variation represented 78%–79% of total variance, whilst separating the Sado drainage (three groups) increased the between-groups component to 91%–95%. This maximization of the ΦCT value, with an associated extreme reduction in the among-basins within-groups variance to a value similar to the within-basins variance, indicates that three major groupings probably represent the natural phylogeographic structure of the species, where genetic similarities coincide with the geographical distribution of basins.

For an endangered species such as C. lusitanicum, where the identification of critical habitats, establishment of territorial jurisdiction and the evaluation of population changes are needed, it is extremely important that an accurate definition of distinct population units is achieved. Different genetic units suggested for conservation purposes have been based on the degree of evolutionary divergence among populations (Moritz, 1994a, b): Evolutionary Significant Units (ESUs; Ryder, 1986), defined as populations that are highly differentiated genetically, and between which sufficient time has elapsed since divergence such that they are reciprocally monophyletic; and Management Units (MUs), described as more recently diverged populations between which there may be some current gene flow, and that differ primarily in genotype frequencies. The pattern of phylogeographic structure in C. lusitanicum suggests that all river basin populations, with the exception of the combined two southern basins (Mira and Arade), can be considered as ESUs, being reciprocally monophyletic and presenting high and highly significant ΦST values. The Tejo and Samarra populations, whilst possessing very closely related haplotypes, still exhibit reciprocal monophyly and share no cyt b sequence or ND-5/6 RFLP haplotypes. The Mira and Arade populations, although exhibiting high pairwise ΦST values (which were, however, nonsignificant for the cyt b sequence data), did not form well-defined monophyletic groups (Fig. 3), so should not be considered as independent ESUs because they were most likely separated relatively recently. Nevertheless, the high pairwise ΦST values (which were significant for the ND-5/6 RFLP data), and the consequent indication of a lack of current gene flow between these basins, indicate a degree of population differentiation that suggests that the Mira and Arade populations should be classified as separate MUs.

In conclusion, the C. lusitanicum populations of all five river basins sampled in this study should be managed separately for the conservation of biodiversity in this species, with four areas (Tejo, Samarra, Sado and Mira/Arade) representing ESUs, and the southern populations (Mira/Arade) possibly warranting taxonomic re-classification. Considering the presence of species of two different genera, in the same area, displaying similar phylogeographic genetic structuring, biodiversity management in this area of southern Portugal should be undertaken at a geographical scale, taking into account the historical and current biogeographical dichotomies between basins.