Introduction

The enormous karyotype variation in the common shrew, Sorex araneus, has been extensively studied for several decades. So far, more than 50 chromosome races have been described (Zima et al, 1996; Bulatova et al, 2000; Mishta et al, 2000; Polyakov et al, 2000) and the list of S. araneus races is probably still incomplete. The chromosome races of S. araneus are characterized by different sets of acrocentrics and metacentrics. As the ancestral karyotype of S. araneus was most probably acrocentric (Wójcik and Searle, 1988), Robertsonian (Rb, centric) fusions leading to the metacentric state must have occurred and become fixed. The conditions under which a new Rb fusion could have become fixed can be summarized in two most probable models. The first is the chain-process variant of White's (1978) stasipatric model of chromosomal evolution. The distribution of metacentrics and chromosome races in Poland fits White's stasipatric model remarkably well (Wójcik, 1993). Different metacentrics (arising from Rb fusions) have apparently spread different distances into an ancestral acrocentric distribution (Wójcik, 1993). Two factors support this model of local fixation of Rb metacentrics in the common shrew. First, simple Rb heterozygotes in the common shrew show little disadvantage in comparison to homozygotes (Searle, 1993). Second, Wyttenbach et al (1998) showed meiotic drive in favour of some metacentrics in male common shrews.

On the other hand, the simplest model for local fixation is by genetic drift only. At present, studies of S. araneus based on allozymes (Bengtsson and Frykman, 1990) or microsatellites (Wyttenbach and Hausser, 1996) provide no evidence for genetic drift. However, the conditions for local fixation of metacentrics could have been suitable in the past. At the end of the glaciation, small populations were probably formed at the leading edge of the expansion by constant range changes. Such bottlenecked populations could have been the sites of fixation for new chromosomal variants.

The possibility of determining past population sizes comes from molecular data collected from present-day populations. In our study we used cytochrome b gene sequence data of shrews from five chromosomal races in Poland that belong to two karyotypically distant groups: the West European Karyological Group (WEKG) and the East European Karyological Group (EEKG) (see Searle, 1984; Wójcik, 1993; Searle and Wójcik, 1998). The WEKG races in Poland are characterized by at most five different Rb metacentrics, while within the EEKG races as many as nine different fusions must have occurred (Wójcik, 1993; Fedyk, 1995). Thus, it cannot be ruled out that at least some metacentrics within the EEKG arose by whole-arm reciprocal translocations (WARTs, see Hauffe and Pialek, 1997). It seems that WARTs may be fixed in bottlenecked populations only, as the fertility of heterozygotes for a WART (forming meiotic complexes of at least four elements) is almost certainly lower than that of Rb heterozygotes (forming meiotic trivalents) (Searle and Wójcik, 1998). Thus, the lack of a severe bottleneck in common shrew populations would be an indirect argument for Rb fusions as the main type of chromosomal mutation in S. araneus from Poland, and would support Wójcik's (1993) model of fixation of new Rb metacentrics in large continuous populations. In this paper, we determine past population sizes for S. araneus, making use of molecular data collected from present-day populations. We establish the long-term effective female population size and investigate whether the chromosome races represented by Polish populations of S. araneus experienced population bottlenecks at the time of fixation of the chromosomal variants that define them. We also include results of an allozyme study based on 47 protein loci in common shrew populations from the WEKG and EEKG in Poland, to give a wider picture of genetic variation in this area. Finally, we compare results from allozymes with the cytochrome b gene analyses.

Materials and methods

Tissue samples were collected in 1993–1994 from 10 different S. araneus populations (Figure 1). The sampling localities were (abbreviation, coordinates, sample size for cytochrome b gene analyses): Blizocin (BLI, 51°36′N 22°10′E, n = 5), Jurowce (JUR, 53°11′N 23°07′E, n = 6), Ulaski (ULA, 52°52′N 21°06′E, n = 1), Zubrówka (ZUB, 54°05′N 23°10′E, n = 2), Ramoty (RAM, 53°57′N 19°14′E, n = 2), Kȩpki (KEP, 54°12′N 19°18′E, n = 1), Lubstowo (LUB, 54°08′N 19°12′E, n = 1), Jabłonowiec (JAB, 51°41′N 21°44′E, n = 6), Rokitnia (ROK, 51°35′N 21°46′E, n = 2) and Łutynowo (LUT, 53°34′N 20°19′E, n = 2). Sample sizes for the electrophoretic study ranged from 12 to 42 individuals (219 individuals altogether). Allozymes were not studied in the JAB population. Sampling localities BLI, JUR and ULA were located within the Białowieża chromosome race (EEKG), ZUB within the Gołdap race (EEKG), RAM within the Łȩgucki Młyn race (EEKG), KEP and LUB within the Nogat race (WEKG), and JAB, ROK and LUT within the Drnholec race (WEKG).

Figure 1

Map of the locations of the populations sampled and the distribution of the chromosome races of the common shrew. Race abbreviations after Zima et al (1996). Race ranges after Fedyk et al (2000). Bold line = the position of the contact zone between the WEKG and EEKG karyotypic groups.

Chromosome preparations were made from spleen cells and stained for G-bands with Giemsa reagent after treatment with trypsin (Seabright, 1971). Chromosome arms were labelled according to the nomenclature proposed by Searle et al (1991).

DNA isolation and amplification

Total DNA was extracted from dried toes using the Qiagen DNeasy Tissue Kit following the manufacturer's instructions. Amplification took place in a 25 μl volume containing 50 μM of each dNTP, 2 mM MgCl2, PCR buffer (Qiagen), 1 μM of each primer, 1 U Taq DNA polymerase and 2 μl DNA template per tube in a Perkin Elmer thermal cycler using the following profile: initial denaturation for 3 min at 93°C, denaturation for 30 s at 93°C, annealing for 30 s at 50°C and elongation for 3 min at 72°C. The primers used to amplify the cytochrome b gene fragment were ‘cytb L14841’ and ‘cytb H15915’ (Irwin et al, 1991). Amplified DNA was cleaned using the QIAquick PCR Purification Kit (Qiagen) following the manufacturer's instructions. PCR products were sequenced on an ABI 377 automated sequencer (using PRISM™ Ready Reaction DyeDeoxy Terminator Cycle Sequencing chemistry; ABI) according to the manufacturer's instructions.

Descriptive statistics

Estimates of haplotypic diversity (h) and nucleotide diversity (π) were calculated according to Nei (1987). Relationships among different haplotypes were estimated using a minimum spanning network computed with the ARLEQUIN program (Version 2.0, Schneider et al, 2000).
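As a minimal sketch of the two descriptive statistics named above, the following Python snippet computes haplotype diversity as h = n/(n − 1) × (1 − Σp_i²) and nucleotide diversity as the mean proportion of differing sites over all pairs of sequences. The function names and the toy alignment are hypothetical and serve only as an illustration; the values reported in this paper were obtained with ARLEQUIN, not with this code.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Haplotype diversity: h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_p2)


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / (len(pairs) * length)


# Hypothetical toy alignment (not data from this study): 4 sequences of 10 bp.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(round(h, 3), round(pi, 3))
```

With many rare haplotypes that differ at only one or two sites, h approaches 1 while π stays small, which is the pattern described for the shrew samples in the Results.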

Neutrality test for mtDNA

There is some evidence that mtDNA may not be a strictly neutral marker (William et al, 1995). As a test of the neutrality of different mtDNA haplotypes, we tested whether sequence variation in different populations conforms to the distribution expected under a neutral infinite-site model (Tajima, 1989). Departures from neutrality are measured by Tajima's (1989) D statistic. The test assumes that there has been no severe bottleneck or recent expansion; either would lower the value of D and increase the probability of erroneously rejecting neutrality.
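The D statistic compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by the sample size. The sketch below implements the standard formula from Tajima (1989); the function name is hypothetical, and the input values in the usage line are summary figures taken from the Results (28 individuals, 34 variable positions, π = 0.0036 over 1023 bp) used purely as an illustration. It does not assess significance, and it is not claimed to reproduce the values in Table 2.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size n, segregating sites S and mean pairwise differences."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Estimated variance of the difference between the two diversity estimates.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)


# Illustrative call with summary values from the pooled sample (an assumption,
# not a recomputation from the original alignment).
d_pooled = tajimas_d(28, 34, 0.0036 * 1023)
print(d_pooled < 0)
```

A negative D arises when there are many segregating sites but few pairwise differences, that is, an excess of rare variants, which is expected both under some forms of selection and after a population expansion.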

Historical demography

To study whether a reduction in population size has occurred in the past, we adopted two different approaches. First, the long-term effective female population size was estimated from the equation Ne = 10⁶π/(2sg) (Avise et al, 1988), where π is the nucleotide diversity (mean number of base substitutions per nucleotide between individuals within the given population), s is the evolutionary rate (% substitutions per genome per million years) and g is the generation time. We adopted the estimate of s = 2.5% divergence rate for the cytochrome b gene in mammals (Meyer et al, 1990) and g = 1. Second, insight into the historical demography of the populations studied was gained by examining the distribution of pairwise sequence differences within populations (Slatkin and Hudson, 1991; Rogers and Harpending, 1992; Rogers, 1995). This ‘mismatch distribution’ was used to test whether current haplotypic variation in the common shrew fits better with the expectations of the ‘equilibrium’ model (constant long-term Ne) or with the ‘sudden expansion’ model, which postulates a recent expansion in population size (Rogers, 1995). Moreover, parameters derived from mismatch distributions can be used to get rough estimates of three demographic parameters: effective female population size before (N0) and after (N1 = current size) a hypothesized expansion, and the time (t) to the expansion in generations (Rogers and Harpending, 1992). The initial population size (N0) was estimated from θ0 = 2N0u, where u = 2sk, s is the mutation rate (2.5% per Myr, see Meyer et al, 1990) and k is the length of the sequence (1023 bp). θ0 (the expected pairwise difference before expansion) was estimated as sqrt(v − m), where m and v are the observed mean and variance of pairwise sequence differences. The time (τ) in units of 1/(2u) generations (where u is the sum of the per-nucleotide mutation rates over the mtDNA region under study) was estimated as m − θ0 (Rogers, 1995). This estimation, however, cannot be done if the variance of the mismatch distribution is smaller than the mean. To overcome this, θ0 was set to zero if v < m, following Rogers (1995). It should be noted that if θ0 was equal to zero, the initial population size (N0) was also zero. This does not, however, mean that the ancestral population was of zero size; rather, it means that before the expansion all haplotypes in the population were identical. An estimate of current population size (N1) was derived from the equation θ1 = 2N1u, where θ1 is the expected pairwise difference after expansion (Rogers and Harpending, 1992). The estimates of time since expansion were obtained from the equation τ = 2ut, where t is the time elapsed between N0 and N1 (Rogers and Harpending, 1992). The validity of the estimated expansion model was tested using the parametric bootstrap approach (Schneider and Excoffier, 1999). The sum of squared deviations (SSD) between the observed and the expected mismatch distribution was used as the test statistic (Schneider and Excoffier, 1999). All the calculations mentioned above were done only when the sample size was at least five individuals (samples BLI, JUR and JAB). Analyses of the full sample and of the combined samples for each karyotypic group (WEKG and EEKG) were also performed. All the analyses were done with the ARLEQUIN program (Version 2.0, Schneider et al, 2000).
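The moment estimators described above can be sketched in a few lines of Python. The function names are hypothetical, the sample variance (n − 1 denominator) is an assumption, and the mutation rate is interpreted here as s = 2.5 × 10⁻⁸ per site per year with one generation per year; that interpretation is an assumption about units rather than something stated explicitly in the text.

```python
import math


def rogers_moments(diffs):
    """Return (theta0, tau) from a list of pairwise differences, following Rogers (1995)."""
    n = len(diffs)
    m = sum(diffs) / n
    v = sum((d - m) ** 2 for d in diffs) / (n - 1)  # sample variance (assumed)
    theta0 = math.sqrt(v - m) if v > m else 0.0     # theta0 set to zero when v < m
    tau = m - theta0
    return theta0, tau


def expansion_time(tau, rate_per_site_per_gen, seq_length):
    """Time since expansion in generations: t = tau / (2u), with u = 2sk."""
    u = 2.0 * rate_per_site_per_gen * seq_length
    return tau / (2.0 * u)


# Hypothetical pairwise differences, for illustration only.
theta0, tau = rogers_moments([1, 2, 3, 4, 10])

# tau reported for the EEKG sample in the legend of Figure 3.
t_eekg = expansion_time(3.856, 2.5e-8, 1023)
print(round(t_eekg))
```

Under these assumptions, τ = 3.856 corresponds to roughly 37 700 generations, which lies within the range of expansion times given in the Results. The same conversion, N = θ/(2u), applies to θ0 and θ1.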

Population structure

To test for differentiation among populations, an exact test for population differentiation (testing the hypothesis of a random distribution of k different haplotypes among r populations) was performed using a Markov chain algorithm (Raymond and Rousset, 1995). Furthermore, using the cytochrome b gene data, pairwise FST values were calculated among the BLI, JAB and JUR populations, and among the races and karyotypic groups under study. To compare the divergence between the WEKG and EEKG for two types of markers (mtDNA and allozymes), we calculated Nei's (1978) genetic distance between populations of the different races and karyotypic groups studied. Observed and expected heterozygosity values for the common shrew populations were also calculated to detect a possible loss of genetic variation due to a population bottleneck. The program BOTTLENECK (Cornuet and Luikart, 1996) was used to test for a recent reduction of effective population size. The data for 47 enzyme loci were obtained using conventional starch gel and cellulose acetate electrophoresis of allozymes (Richardson et al, 1986; Murphy et al, 1996).
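For readers unfamiliar with the allozyme distance measure, the following is a minimal sketch of Nei's (1978) unbiased genetic distance, computed from allele frequencies averaged over loci (monomorphic loci included). The function name, the input layout and the example frequencies are hypothetical; negative estimates, which can arise from the small-sample correction, are set to zero here as a common convention.

```python
import math


def nei_unbiased_distance(freqs_x, freqs_y, n_x, n_y):
    """Nei's (1978) unbiased distance between populations X and Y.

    freqs_x, freqs_y: one list of allele frequencies per locus, with alleles
    in the same order in both populations.
    n_x, n_y: numbers of diploid individuals sampled.
    """
    loci = len(freqs_x)
    gx = sum((2 * n_x * sum(p * p for p in f) - 1) / (2 * n_x - 1) for f in freqs_x) / loci
    gy = sum((2 * n_y * sum(q * q for q in f) - 1) / (2 * n_y - 1) for f in freqs_y) / loci
    gxy = sum(sum(p * q for p, q in zip(fx, fy)) for fx, fy in zip(freqs_x, freqs_y)) / loci
    d = -math.log(gxy / math.sqrt(gx * gy))
    return max(d, 0.0)


# Hypothetical two-locus example: one monomorphic locus, one slightly different locus.
pop_a = [[1.0, 0.0], [0.5, 0.5]]
pop_b = [[1.0, 0.0], [0.6, 0.4]]
d_similar = nei_unbiased_distance(pop_a, pop_b, 20, 20)

# A single locus fixed in one population and polymorphic in the other.
d_distinct = nei_unbiased_distance([[1.0, 0.0]], [[0.5, 0.5]], 20, 20)
print(d_similar, round(d_distinct, 3))
```

Small frequency differences at a few loci among many monomorphic ones give distances at or near zero, which is the situation reported for the shrew populations in the Results.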

Results

Characterization of cytochrome b gene sequence

The sequenced fragment of the cytochrome b gene was 1023 nucleotides long (GenBank accession numbers: AJ409867–AJ409894). Thirty-four variable positions, defining 21 distinct haplotypes, were found among the 28 individuals studied (Table 1). Most substitutions were transitions (30 out of 35; the TS:TV ratio was 6), as in other mammalian species, including S. araneus (Irwin et al, 1991; Taberlet et al, 1994). As in the earlier study of the cytochrome b gene of six Sorex species (Fumagalli et al, 1996), we found a deficiency of guanine (13.70%) on the light strand of the cyt b gene; the other nucleotides were more balanced (29.49% thymine, 27.96% cytosine and 28.85% adenine). The mean pairwise nucleotide divergence (π) was 0.0043 (SD ± 0.002, n = 21) among the haplotypes and 0.0036 (SD ± 0.002) among all individuals in the entire sample (k = 378 comparisons). Assuming a divergence rate of 2.5% per Myr for the cytochrome b gene in mammals (Meyer et al, 1990), the divergence of 0.88% between the two most divergent haplotypes indicates that the common ancestor of all current S. araneus haplotypes existed about 350 000 years ago.
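The dating in the preceding paragraph is a simple ratio of observed divergence to the assumed divergence rate. A short sketch of that arithmetic follows; the function name and the optional `rate_multiplier` argument are hypothetical and are included only to show how a faster molecular clock would shorten the estimate.

```python
def coalescence_time_years(divergence_percent, rate_percent_per_myr, rate_multiplier=1.0):
    """Time to the common ancestor, in years, from pairwise divergence and divergence rate."""
    return divergence_percent / (rate_percent_per_myr * rate_multiplier) * 1e6


# 0.88% maximum divergence at 2.5% per Myr.
t_default = coalescence_time_years(0.88, 2.5)

# The same divergence under a clock running 1.6 times faster (illustrative value).
t_faster = coalescence_time_years(0.88, 2.5, rate_multiplier=1.6)
print(round(t_default), round(t_faster))
```

The first call gives about 352 000 years, rounded to 350 000 in the text; with a 1.6-fold faster rate the estimate drops to about 220 000 years.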

Table 1 Cytochrome b haplotypes found in 10 Sorex araneus populations studied. For population abbreviations and their chromosome constitution see Materials and methods

Interpopulation variation of the haplotypes

The most common haplotype was found in only four out of 10 populations studied, and the majority of haplotypes (18 out of 21) were represented only once (Table 1). The most common haplotype was present in both the EEKG and the WEKG; in the WEKG, however, it was found only once. Haplotype diversity estimates within the populations sampled were high (h = 0.800–0.928), while nucleotide diversity estimates were low (π = 0.0034–0.0053 ± 0.002, Table 2). The question arises whether the haplotype and nucleotide diversity estimates can be explained by selection or by the historical demography of the populations.

Table 2 Cytochrome b haplotype (h) and nucleotide (π) diversities, Tajima's (1989) D statistic and estimated long-term effective female population size (Ne) for Sorex araneus samples

Tajima's (1989) test of neutrality was performed. The distribution of haplotypic variability within populations, races and karyotypic groups was compared to that expected under the infinite-site model without recombination. The hypothesis of neutrality was not rejected in the BLI, JUR and JAB samples. We rejected the null hypothesis of neutrality in the combined samples of the Białowieża race, of the EEKG and WEKG karyotypic groups, and in the pooled sample of all specimens (Table 2). The rejection of neutrality, however, may be due to factors other than selection acting on haplotypes, for example population expansion, which is in fact supported by the significantly negative Tajima's D values (Table 2). Thus, we also performed analyses to explore possible historical explanations.

The long-term effective population size (Ne) calculated using the mean pairwise divergence and the evolutionary rate (2.5% per Myr) for the pooled sample was 70 000. Ne values for the EEKG and WEKG were 68 000 and 74 000, respectively (Table 2). We also estimated the effective female population size after expansion (N1, Table 3), using the method of Rogers and Harpending (1992). The N1 values in Table 3 seem to be overestimated; nevertheless, they are remarkably high, indicating a considerable increase in population size during the expansion. Thus, there was probably no recent population bottleneck after the expansion of S. araneus populations following the last glaciation in Poland. For the groups studied, the observed distribution of pairwise sequence differences (the mismatch distribution) was unimodal and followed the expected distribution of a growing population (see Figure 3 for the EEKG as an example). The time since the beginning of the expansion (t = 37 400–62 400 years ago, Table 3), estimated according to Rogers and Harpending (1992), coincides with the last glacial period.
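The long-term Ne figures above follow from the Avise et al (1988) equation given in Materials and methods. The sketch below applies that equation; the function name is hypothetical, and treating the 2.5% rate as the fraction 0.025 per Myr with a generation time of one year is an assumption about units, chosen because it yields values of the same magnitude as those reported.

```python
def long_term_female_ne(pi, rate_per_myr=0.025, generation_years=1.0):
    """Long-term effective female population size: Ne = 1e6 * pi / (2 * s * g)."""
    return 1e6 * pi / (2.0 * rate_per_myr * generation_years)


# Pooled-sample nucleotide diversity as quoted in the Results.
ne_pooled = long_term_female_ne(0.0036)
print(round(ne_pooled))
```

With π = 0.0036 the result is 72 000, close to the reported 70 000; the small difference presumably reflects rounding of π. Likewise, π values of 0.0034 and 0.0037 would give 68 000 and 74 000.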

Table 3 Results of the analysis of pairwise sequence differences, showing the approximate time to the beginning of the expansion (t), the effective female population size at the beginning (N0) and at the end (= present day) of the expansion (N1), and the probability (P) that the simulated sum of squared deviations is greater than or equal to the observed sum of squared deviations (the fit to the predicted expansion scenario)
Figure 3

The observed and expected distribution of pairwise differences under the model of population expansion (Rogers and Harpending, 1992). The parameters used were for the EEKG group of the common shrew: τ = 3.856, θ0 = 0.126 and θ1 = 16.977.

Population differentiation

We found no statistically significant differentiation in the cytochrome b gene between karyotypic groups (pairwise test of the hypothesis of non-differentiation, NS), races (NS) or populations (NS). All the FST values were equal to zero. Hence, owing to the low degree of absolute divergence and the fairly short time to common ancestry, no population subdivision was found in the S. araneus studied. The lack of phylogeographical structure is also evident from the minimum spanning network (Figure 2), which seems to represent a ‘sudden expansion’ model.

Figure 2

The minimum spanning network obtained for nine populations of the common shrew in Poland. For abbreviations see Materials and methods. Each small black node represents one substitution. Alternative links between haplotypes are indicated with an asterisk (*).

Allozyme analysis

Eight loci were found to be polymorphic, with two or more alleles in at least one population: aminoacylase, esterase-1, esterase-D, isocitrate dehydrogenase-2, lactate dehydrogenase-2, mannose phosphate isomerase, phosphoglucomutase-1 and phosphoglucomutase-3. The other loci were monomorphic. None of the alleles was diagnostic for any race or karyotypic group. The full list of loci and allele frequencies is available from the authors upon request. Heterozygosity values (both observed and expected) were similar in all the populations studied (Table 4). Tests for a population bottleneck revealed no evidence for a recent reduction in effective population size in any sampling area (Table 4). The values of Nei's (1978) genetic distance (D) among the populations studied were low. Values of D among populations within the EEKG varied from 0.000 to 0.006 and within the WEKG from 0.000 to 0.009; the average D between the WEKG and EEKG was 0.003 (0.000–0.006), showing no divergence between the WEKG and EEKG at allozyme loci.

Table 4 Observed (Ho) and expected (He) heterozygosity values calculated over 47 allozyme loci for nine Sorex araneus populations studied (standard error in parentheses). The probability (P) that the populations studied fit to mutation-drift equilibrium is also given as computed using BOTTLENECK (Wilcoxon test; Cornuet and Luikart, 1996)

Discussion

Selection on mtDNA and demographic explanations

We did not notice an increased frequency of any haplotype in any chromosome race or karyotypic group, which would have been indicative of positive selection favouring the common haplotype. Surprisingly, neutrality was rejected in the case of the Białowieża race, the WEKG, the EEKG and the pooled sample (Table 2). This, however, could be expected if the populations under study have not been of constant size for considerable periods of time. In such a case, the approach cannot strictly separate selective and demographic explanations (Tajima, 1989; William et al, 1995). An excess of singletons may be evidence of expanding populations (William et al, 1995). Indeed, the expansion scenario is supported by the significantly negative Tajima's D values (Table 2) and the star-like haplotype network (Figure 2).

The observed distribution of pairwise sequence differences (the mismatch distribution) was unimodal (Figure 3) and followed the expected distribution of a growing population (Rogers and Harpending, 1992). The long-term effective female population sizes (Ne) estimated from nucleotide diversity (π) seem to be relatively high (Table 2), indicating that the population size has probably not dropped to a few individuals since the last glaciation. Furthermore, the estimates of present-day female population sizes (N1), based on the mismatch distribution of pairwise sequence differences (Rogers and Harpending, 1992), were in the tens of millions of individuals (Table 3). The N1 values in Table 3 are probably overestimated and should be regarded as rough approximations. Hence, although there is large uncertainty about the true values of the estimated parameters, the calculated N1 values (Table 3) indicate very large present-day female population sizes. It can be assumed that, when the glacial period ended, the common shrew rapidly occupied new habitats and its population size increased. This is in agreement with ecological data: in favourable habitats S. araneus presumably exists in large, continuous populations (Croin-Michielsen, 1966). Since the mismatch distributions were not strongly L-shaped (left truncated), which would be typical of strongly bottlenecked populations (Marjoram and Donnelly, 1994), we assume that there has been no population bottleneck in S. araneus populations since the last glaciation.

Concordance between molecular data and models of chromosomal evolution

The ‘sudden expansion’ model suggested by our study of the cyt b gene of the common shrew is in agreement with the chain variant of the stasipatric model of chromosome evolution (White, 1978) proposed for chromosome races in Poland (Wójcik, 1993). The race-diagnostic metacentrics may have arisen within the species range by Rb fusions during the expansion through Europe. According to Wójcik (1993), the area of Poland was populated from the southwest by shrews possessing the jl and hi metacentrics, while shrews with jl and gr colonized this area from the east during the early post-glacial period. Different metacentrics subsequently spread through the populations (Wójcik, 1993). Our data from the cytochrome b gene and allozyme variation seem to favour the model in which Rb fusions rather than WARTs played the main role in the chromosome evolution of the common shrew in Poland. The lack of a population bottleneck probably rules out the possibility of WARTs in Polish chromosome races of the common shrew, because WARTs may be fixed in bottlenecked populations only (Searle and Wójcik, 1998). This does not, however, mean that no shrew metacentrics have been formed by WARTs. As the distribution and chromosomal relationships of some races in Finland, Sweden and Siberia could be most easily explained using WARTs (Halkka et al, 1987; Fredga, 1996; Polyakov et al, 2000), it would be interesting to use molecular markers to study whether populations belonging to those races were bottlenecked in the past.

Divergence between karyotypic groups on molecular grounds

Our study of the cytochrome b gene of S. araneus revealed a somewhat intriguing picture of population structure. Two aspects seem to be most striking: the lack of ancient divergence on molecular grounds between the two karyotypic groups of S. araneus, the EEKG and WEKG, and the star-like phylogeny of the different haplotypes (Figure 2). We found only slight differences in haplotype frequencies among different populations, despite the fact that they represent different karyotypic groups and chromosome races. Likewise, the largest pairwise nucleotide divergence among the shrews studied was only 0.88%. This does not, however, rule out the possibility that common shrews from the WEKG and EEKG survived in different refugia. Unfortunately, the cytochrome b gene does not seem to be an appropriate marker to distinguish between such alternatives. Bilton et al (1998) showed that the same branch of the NJ tree of cyt b haplotypes links together individuals from a huge geographical area extending from Western Europe to Eastern Siberia, although they most certainly did not survive the last glaciation in a single refugium (see Searle and Wójcik, 1998). The estimated time to the common ancestor of all current S. araneus haplotypes (350 000 years ago) is certainly too long for the last recolonization of Poland. To explain this, two hypotheses may be considered: (1) the molecular clock is faster in Sorex, or (2) S. araneus did not suffer from a bottleneck before the last recolonization. At the moment, it is not possible to say which hypothesis is more probable. It should be noted, however, that the evolutionary rate of mtDNA in Insectivora may show a 1.6-fold increase when compared to the slowest evolving orders of mammals (Gissi et al, 2000). If this holds true for Sorex, all the haplotypes studied coalesce to a common ancestor that lived around 220 000 years ago.

The lack of population structuring in present-day populations most probably reflects the retention of shared ancestral polymorphism spread over large geographical areas by recent population expansion. The expansion probably started about 37 400 to 62 400 years ago (Table 3), which coincides with the last glacial period. Thus, different chromosome races may have arisen within the species range during the expansion through Europe.

Although the chromosomal differences between distinct races of S. araneus in Poland are evident, the pattern of genetic variation between karyotypic groups revealed for the cyt b gene does not represent secondary contact between previously isolated populations. Thus, the results from the cytochrome b gene studies are in agreement with the hypothesis that the chromosomal variation in the common shrew is recent (Taberlet et al, 1994).