Introduction

Biodiversity conservation is an essential challenge for modern plant science. Plant species suffer from shrinking of favorable areas due either to anthropogenic impact or to exclusion by competitively superior alien species that broadly invade disturbed habitats1. In recent years the rate of invasion has increased steadily2,3,4. The expansion of alien species into new territories, either caused or facilitated by human activity, has become a global phenomenon that damages indigenous ecosystems and negatively affects the variety of population characteristics, community biodiversity, and ecosystem services5,6,7.

Garden lupine Lupinus polyphyllus Lindl. is a herbaceous biennial or short-lived perennial plant 0.8–1 m in height. It bears multiple (up to 80) flowers of blue shades, or less often of pink, purple or white coloration, gathered in an upright terminal inflorescence. The native range of L. polyphyllus covers the west of North America: Canada (British Columbia) and the USA (Alaska, western Oregon and Washington, northern California). The species inhabits river banks, meadows, roadsides and other kinds of disturbed habitats8,9.

In Europe, lupine has been known since the sixteenth century10 and was originally introduced as a garden plant. It was also grown for soil improvement and stabilization, especially after road building or deforestation, as well as for fertilization of arable land and for fodder11. At present, L. polyphyllus is widespread in Western Russia on the East European Plain and is included in the national list of the most dangerous invasive species (Top–100)12. In the center of the European part of Russia, L. polyphyllus broadly invades natural meadow and forest plant communities with different degrees of canopy closure13.

Lower genetic variability in a secondary range, in comparison with the native range, is a well-known phenomenon for populations of alien species, both plants and animals. For example, using inter-simple sequence repeat (ISSR) markers, it was demonstrated for the Fabaceae tree species Acacia longifolia14 and the ant species Linepithema humile15. In addition, alien species with lower inter-population variability in their secondary ranges tend to show higher invasion success16,17.

Despite the evidence of high invasion activity of L. polyphyllus, only a few studies of its genetic variability in the secondary range are known18.

In our previous study19 we demonstrated low but significant overall genetic diversity of L. polyphyllus at the northern limit of its secondary range, at both inter- and intra-population levels. A study from more southerly parts of the species’ secondary range in the European part of Russia20 demonstrated higher intra-population (rather than inter-population) variability of L. polyphyllus.

In the present paper we aimed to extend the geographical scope of the previous study and to investigate the molecular genetic diversity of L. polyphyllus from different parts of its secondary distribution range on the East European Plain, as well as to describe its contribution to natural vegetation there.

Results and discussion

Nuclear ribosomal internal transcribed spacer sequences (ITS1–2) showed no variability and therefore are not described further. For the genus Lupinus, ITS sequences were previously known as informative markers of inter-species (within-genus) variation21. At the intra-specific level, Skorupski and co-authors found no differences between invasive populations of Lupinus nootkatensis in Iceland using ITS222. In another botanical family, at the intra-species level, ITS sequences of Passiflora alata demonstrated even higher variability than cpDNA sequences did23. Our study revealed that ITS1–2 sequences are non-informative markers of interpopulation variability for L. polyphyllus.

A phylogenetic tree based on chloroplast intergenic non-coding spacer rpl32–trnL sequences is shown in Fig. 1. In this tree, two major clades with high bootstrap support values (89 and 90%) are distinguished, with remarkably different numbers of specimens (84 and 6).

Fig. 1

The phylogenetic tree of the studied specimens of L. polyphyllus based on the chloroplast intergenic non-coding spacer rpl32–trnL site. Colors of the circles next to sample names correspond to petal coloration. The first letters in a sample (specimen) name designate a population, and the subsequent symbols (digits with lowercase letters) code a specific individual. In addition, population locations are specified by regions of Russia, except for those from Belarus and Canada. Genetically identical specimens are merged into common branches of the tree. The absence of geographical patterns within the tree suggests multiple events of human-induced introduction.

Within clade I, 37 of the total 84 specimens were grouped in a subclade with a relatively high bootstrap support value (59%). In addition, a sole specimen from location LP1 (St. Petersburg, Russia) was segregated from the remaining 46 specimens within clade I. No geographical pattern was found within clade I: specimens from different parts of the secondary range were present in the major subclade as well as among the remaining 46 samples, including specimens from outside the East European Plain and from the native range (west of North America).

Clade II comprises 6 specimens from the western (Belarus), central (Orel and Kaluga regions of Russia), and northern (Kola Peninsula, Murmansk region) parts of the East European Plain (Fig. 1). These samples possessed the lowest number of nucleotide substitutions in relation to the outgroup (L. albus).

No relation between petal coloration and membership in a specific clade was found. Geographically distant specimens from Siberia (LNsk, LBur1, LBur2) and Canada (LAm1 and LAm2) did not form a separate clade and were also merged within different subclades of clade I.

Five of the studied populations demonstrated a considerable level of intrapopulation variability, comparable with the level of interpopulation differences. Specifically, specimens from locations LA, LVlad2, and LV4 were dispersed between the two major subclades within clade I (Fig. 1). Specimens from locations LM1–4 and LK3 possessed even stronger intrapopulation differences and were dispersed between the two major clades (I and II, Fig. 1).

Cases in which intrapopulation variability is as high as interpopulation variability (or even higher) are common among alien species that escaped from cultivation and then became invasive (e.g., Passiflora alata23, Pueraria lobata24).

Several groups of the studied populations can be considered “neighboring” ones. Thus, the two populations from locations LM1–4 and LM5 occurred ca. 200 m from one another but were separated by a wooded brook valley. Similarly, the sample LCh4 and the group of specimens in location LCh1–3 were situated ca. 800 m from one another and were separated by a local woodland. Finally, the two locations LKu and LBmo (Supplementary Table S1) were ca. 1.2 km apart and were separated by a residential area. Notably, the specimens from both of the former pairs of populations (LM1–4 and LM5; LCh1–3 and LCh4) were dispersed between the two major subclades of clade I. In contrast, the latter pair of populations (LKu and LBmo) was merged into a single subclade. Therefore, two of the three “neighboring” groups of populations demonstrated considerable inter-population variability at the lowest spatial scale.

Perhaps the broader geographical scale of our study allowed us to discover the existence of contrasting spatial patterns in the genetic variability of L. polyphyllus. Specifically, we detected conspicuous intra-population polymorphism in all studied parts of the secondary range (northern, central, and southern), which extends our earlier suggestions19 on the intensity of the microevolution process. At the lowest spatial scale of interpopulation variability, two of the three groups of neighboring populations showed considerable polymorphism. Previously, a much lower level of interpopulation variability, in comparison with the intrapopulation level, was reported for L. polyphyllus from Lithuania (west of the East European Plain)25. Those authors also did not reveal any spatial patterns in inter-population genetic variability, which is consistent with the findings of the present study.

The haplotype network obtained from the analysis of rpl32–trnL fragments (Fig. 2) was fully consistent with the phylogenetic tree described above for the same site. The network was formed by four haplotypes, three of which (1, 1A, 1B) were very similar to one another and corresponded to clade I in the phylogenetic tree. Two of these haplotypes (1 and 1A) comprised most of the studied specimens from all regions and were linked with the sole specimen from location LP1 (St. Petersburg, Russia), which formed the third haplotype, 1B. Haplotype 2 corresponded to clade II in the phylogenetic tree and was a sister haplotype to the other three haplotypes (Fig. 2). No definite pattern was revealed in the geographical distribution of the haplotypes (Fig. 3), but the scarcity of data outside Europe does not allow this result to be interpreted in more detail. This scarcity is especially critical for the native range and does not allow the initial steps of the invasion history to be discussed. The scarcity of data from the Asian part of the secondary range stems mainly from the lower occurrence of L. polyphyllus there, and intensive field studies are required to close this gap.

Fig. 2

A haplotype network based on analysis of the chloroplast intergenic non-coding spacer rpl32–trnL in L. polyphyllus specimens from different parts of the species range. The ancestral haplotype (indicated by the arrow) is absent among the studied specimens. Haplotype numbering in this figure is consistent with clade numbering in Fig. 1.

Fig. 3

Geographical distribution of the haplotypes of the studied L. polyphyllus specimens. Haplotype designations are the same as in Fig. 2. (Made with QGIS v. 3.28.11, https://www.qgis.org/).

Using PCR, 57 inter-simple sequence repeat (ISSR) markers were obtained, of which 47 were informative. Ordination of specimens based on the ISSR markers revealed a large extent of both intra- and interpopulation variability of L. polyphyllus (Fig. 4). The broad dispersion of individuals from most local populations in the ordination space can be treated as evidence of high intra-population variability. Membership of a specimen in a specific local population was the only significant factor among all factors tested. In the ANOSIM test, specimen differentiation was significant (though not high) among the local populations (R = 0.33, p < 0.01) as well as among the enlarged locations consisting of neighboring populations (R = 0.26, p < 0.01). No relation with geographical (latitudinal) position was found either in the ordination space or in the ANOSIM test (p > 0.05). In addition, neither membership in a certain subclade (Fig. 1) nor corolla coloration per se was a significant factor of specimen differentiation (p > 0.05 for each factor in the ANOSIM test).

Fig. 4

Sample ordination plot of nonmetric multidimensional scaling results, based on Jaccard distance. Each point represents a specimen; point color matches specimen petal coloration. Labels designate the population location codes. No clear patterns in specimen differentiation by petal coloration can be found. Specimens within most populations show broad dispersion in the ordination space, which suggests a high level of intra-population variability. The plot was produced with the R “ggplot2” package, v. 3.4.0 43.

The Nei–Li distances (Supplementary Table S4), calculated from binary ISSR data, were comparable at the within- (0.248 ± 0.031) and between-population (0.226 ± 0.044) levels (mean ± SD). For populations of L. polyphyllus from Finland, Li and co-authors26 found that intrapopulation variability exceeded the interpopulation level. Like us, however, those authors did not find a latitudinal pattern in the genetic variation of L. polyphyllus. The extensive data allowed us to identify two levels of inter-population variability in the present study: variability among locations (the mode previously reported by other researchers25,26) and variability among enlarged locations consisting of neighboring populations.

To assess the activity of the invasion process, we investigated the plant communities where garden lupine specimens were detected (Supplementary Table S5). The abundance of L. polyphyllus in terms of projective cover was higher in the central regions (45–50%) than in either the southerly or the northerly regions (< 30%) of the East European Plain. It has been reported that L. polyphyllus occupies a certain ecological niche soon after its introduction into local native vegetation and does not change its influence on the community after many years27. Therefore, the revealed difference in lupine projective cover among the plant communities of the studied regions is deemed to be caused mainly by abiotic factors rather than by population age or competitive relationships between plant species. In the southern and northern regions, cover did not exceed 25–30%, which suggests less favorable environmental conditions for L. polyphyllus at both the northern (low temperatures) and southern (lack of precipitation or soil moisture) limits of its secondary distribution range; this coincides with the data obtained in Central Europe18.

In the southern and central regions of the studied area, all investigated populations inhabited open biotopes with meadow vegetation. In the northern regions, L. polyphyllus occurred either in forest gaps (location LL) or, in locations LV1–LV4, in meadow communities and forest communities with a low degree of canopy closure (10–15%).

In all investigated regions, the families Poaceae and Asteraceae were the most common floristic components of the communities. In addition, Salicaceae in the central part and Fabaceae in the southern part of the studied area were noticeable components of the community floristic composition. L. polyphyllus is known for its capability to form monodominant stands6,28. We did not detect monodominant stands of garden lupine in the investigated plots, though some of them can be characterized as oligomixic communities with a considerable prevalence of only a few dominant species.

Overall, regardless of the plot vegetation type, all investigated populations were found in anthropogenically disturbed habitats with present or former settlement activity. This finding is not surprising: in general, disturbed habitats are established “hot spots” of alien species dispersion1. On the other hand, L. polyphyllus remains a popular garden plant to this day, and it can facilitate the invasions of other alien species via its priority effect29.

Conclusions

Our study revealed that ITS1–2 sequences were non-informative markers of intra-species variability for L. polyphyllus.

In the phylogenetic tree based on chloroplast rpl32–trnL sequences, a considerable level of intra-population variability was detected. No geographical or petal coloration patterns were found between the two main clades of the tree.

Two sister haplotypes were displayed in the haplotype network based on chloroplast rpl32–trnL, one of which included individuals from the species’ native range (west of North America). No geographical pattern in haplotype distribution on the East European Plain was revealed, which suggests multiple introductions of L. polyphyllus from different sources.

ISSR sequence data demonstrated comparable levels of within- (0.248 ± 0.031) and between-population (0.226 ± 0.044) variability (mean ± SD, Nei–Li distances). Statistically significant interpopulation variability was detected at two spatial scales: among the local populations and among the enlarged locations consisting of the neighboring populations.

All investigated populations of L. polyphyllus on the East European Plain invaded anthropogenically disturbed habitats with present or former settlement activity. The highest projective cover of L. polyphyllus was registered in the central regions of the studied area, which are characterized by milder climatic conditions in comparison with the more southerly and northerly regions. The ongoing aridization of the climate of the East European Plain (an increase in average annual temperature and a shift in the precipitation regime during the growing season) suggests a decrease in the abundance of L. polyphyllus in the southern part of its secondary range. Nevertheless, the revealed genetic variability of specimens at the lowest spatial scale, at both intra- and interpopulation levels, may indicate a high invasion potential of L. polyphyllus in all areas of the studied secondary range under ongoing climatic changes.

The scarcity of data from the native range impeded the interpretation of the haplotype network in terms of invasion history in the present study; more genetic data available from public repositories can be used in future studies. The low number of herbarium records of L. polyphyllus from the Asian part of its secondary range calls for more intensive field research, which, together with monitoring of changes in genetic variability over time in the territories already investigated, represents the prospects of the current study.

Materials and methods

Field research and herbarium materials

The search for L. polyphyllus occurrences on the East European Plain was carried out using a route-based method. For the investigation, the nine largest populations detected at different sites in the southern, central, and northern parts of the species’ secondary range on the East European Plain were chosen. In total, 41 specimens were sampled (Table 1 and Supplementary Table S1); species identification was undertaken by the collectors (Supplementary Table S1). Field material was dried using silica gel for subsequent DNA extraction. The voucher specimens were deposited at the MHA herbarium (Supplementary Table S1). Examples of petal coloration are shown in Supplementary Fig. S1. At sites where garden lupine was found, we conducted vegetation descriptions (sample plot size 10 × 10 m) using standard geobotanical methods30.

Table 1 Summary of field and herbarium specimens and previously published data involved in the study.

To clarify phylogenetic relationships, additional material from 26 herbarium specimens from the East European Plain, Siberia, and North America was included from the herbaria LE, MHA, and MW (Table 1, Supplementary Table S2). Figure 5 summarizes the field and herbarium material involved in the present study.

Fig. 5

Occurrences of the specimens used in the present study across the world. (Made with QGIS v. 3.28.11, https://www.qgis.org/).

Molecular data

DNA was extracted from silica-gel-dried leaves and from leaves of herbarium samples. Since the nuclear ribosomal internal transcribed spacer 1–2 (ITS1–2) turned out to be non-informative for the studied species, the highly variable chloroplast marker rpl32–trnL was also investigated. For the ITS1–2 region, the primers nnc18s10 (forward) and c26A (reverse) were used. For the rpl32–trnL spacer, the primers rpl32F (forward) and trnL UAG (reverse) were used. A total of 35 sequences from the ITS site and 67 sequences from the rpl32–trnL site were obtained for further data analysis. The data were submitted to GenBank31, where these nucleotide sequences can be found by their accession numbers (Supplementary Tables S1 and S2).

ISSR markers were chosen as non-plastid markers. Six ISSR primers were used, namely UBC55 [(ACACAC)2ACACCYT], M1 [(AC)8CG], M2 [(AC)8(C/T)G], M11 [(CACACA)2(A/G)], M12 [(CA)6RY], and 17899a [(AC)6AG], to obtain fragments from specimens collected in the field (Table 1). To determine DNA fragment lengths, a 100 bp + molecular weight marker was used. Images with DNA fragments (Supplementary Fig. S2) were analyzed using CrossChecker software32, with compilation of binary matrices of the presence/absence of fragments of the same length. Material from 38 specimens was obtained for further analysis of ISSR data.
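
The compilation of a presence/absence matrix from scored fragment lengths can be illustrated with a minimal Python sketch. It is not the CrossChecker workflow itself; the function name, the specimen labels and the fragment lengths below are hypothetical and serve only to show the data structure.

```python
def build_binary_matrix(fragments_by_specimen):
    """Convert scored fragment lengths into a presence/absence matrix.

    fragments_by_specimen: dict mapping a specimen label to the list of
    fragment lengths (bp) scored for that specimen.
    Returns (sorted list of all lengths, dict specimen -> list of 0/1).
    """
    # Every distinct fragment length observed in any specimen is one marker.
    all_lengths = sorted(
        {length for lengths in fragments_by_specimen.values() for length in lengths}
    )
    matrix = {}
    for specimen, lengths in fragments_by_specimen.items():
        present = set(lengths)
        # 1 = fragment of this length present, 0 = absent.
        matrix[specimen] = [1 if length in present else 0 for length in all_lengths]
    return all_lengths, matrix


# Hypothetical scored bands for three specimens (illustrative values only).
scored = {"S1": [300, 450, 700], "S2": [300, 700, 900], "S3": [450]}
lengths, matrix = build_binary_matrix(scored)
print(lengths)
for specimen, row in matrix.items():
    print(specimen, row)
```

Each row of the resulting matrix is the binary profile of one specimen, which is the input assumed by the distance calculations described in the data analysis.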

Data analysis

Besides the samples obtained de novo from the field and herbarium material (Supplementary Tables S1 and S2), the data analysis also involved previously reported19 data on L. polyphyllus specimens from the Veps Forest Nature Park and from the MHA and LE herbaria (Supplementary Table S3): 16 samples for the ITS site and 23 samples for the rpl32–trnL site. Thus, totals of 51 and 90 sequences, respectively, were obtained for the two sites for reconstruction of phylogenetic relationships (Table 1).

Sequences were checked, manually edited and aligned using the BioEdit v. 7.0.5.3 program33. All alignments were built from consensus sequences obtained by direct sequencing of PCR products. We paid special attention to careful examination of electropherograms to identify sites with nucleotide substitutions. The evolutionary history was inferred by using the Maximum Likelihood method and the Tamura-Nei model34. The tree with the highest log likelihood (-586.68) was chosen. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Tamura-Nei model, and then selecting the topology with the superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. There were a total of 401 positions in the final dataset. Evolutionary analyses were conducted in MEGA1135. As an outgroup for the phylogenetic analysis we used data on Lupinus albus L., obtained from GenBank (accession number NC026681.1)36. To construct the haplotype networks for the rpl32–trnL site, we used the TCS 1.21 program37.
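
The first step of any haplotype analysis, merging identical aligned sequences and counting the differences between the resulting haplotypes, can be sketched in Python as follows. This is only an illustration of the idea and not the statistical parsimony procedure implemented in TCS; the function names, sample labels and short sequences are hypothetical.

```python
from collections import defaultdict


def collapse_haplotypes(aligned):
    """Group specimens that share an identical aligned sequence.

    aligned: dict mapping a specimen label to its aligned sequence.
    Returns a dict mapping each distinct sequence to its specimen labels.
    """
    groups = defaultdict(list)
    for name, seq in aligned.items():
        groups[seq.upper()].append(name)
    return dict(groups)


def count_differences(seq_a, seq_b):
    """Number of alignment positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


# Hypothetical aligned fragments (illustrative, not real rpl32-trnL data).
alignment = {
    "P1a": "ACGTACGTAA",
    "P1b": "ACGTACGTAA",
    "P2a": "ACGTTCGTAA",
    "P3a": "ACGTACGTAA",
}
haplotypes = collapse_haplotypes(alignment)
for seq, members in haplotypes.items():
    print(seq, members)
print(count_differences("ACGTACGTAA", "ACGTTCGTAA"))
```

Specimens sharing a sequence form one node of a network, and the difference counts give the number of mutational steps that a network-building program would need to connect the nodes.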

The ISSR data were available for the field material of 2023 only (Table 1). The data were presented in the form of a matrix of binary features, in which the presence or absence of a certain fragment was coded as 1 or 0, respectively. From these binary data, the average distances among the plants within each population were calculated. For the calculations we used the Nei–Li coefficient38, which is algebraically equivalent to the Dice (Sørensen) similarity measure39. For consistency, inter-population distances were calculated in the same manner, but with a preliminary summation of marker prevalence in each population. All calculations were conducted with the R package “ade4”40.
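
For binary profiles, the Nei–Li (Dice) distance between two specimens is 1 − 2n_ab / (n_a + n_b), where n_a and n_b are the numbers of fragments present in each specimen and n_ab is the number of shared fragments. The following Python sketch illustrates the within-population averaging under this definition; it does not reproduce the “ade4” calculations, the profiles are hypothetical, and the treatment of two empty profiles (distance 0) is an assumption made here for convenience.

```python
from itertools import combinations


def nei_li_distance(a, b):
    """Nei-Li (Dice) distance between two binary profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    if total == 0:
        # Assumption: two profiles without any fragments are treated as identical.
        return 0.0
    return 1.0 - 2.0 * shared / total


def mean_within_distance(profiles):
    """Average pairwise Nei-Li distance among the specimens of one population."""
    pairs = list(combinations(profiles, 2))
    if not pairs:
        raise ValueError("at least two specimens are required")
    return sum(nei_li_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical binary ISSR profiles of three plants from one population.
population = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
]
d_first_pair = nei_li_distance(population[0], population[1])
mean_d = mean_within_distance(population)
print(round(d_first_pair, 3), round(mean_d, 3))
```

The inter-population step described above (summation of marker prevalence per population before computing distances) is not shown, since its exact form is not specified here.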

For visual exploration of overall genetic variability by ISSR markers, we used non-metric multidimensional scaling (nMDS) on Jaccard distances. To test the significance of differences among specimens by different characters, we used the ANOSIM test on Jaccard distances. By means of a permutation procedure, the ANOSIM test allows estimation of the strength and significance of observed differences among groups of objects (specimens in this study). Specifically, we checked the significance of inter-population differences when specimens were grouped by populations. Both analyses were conducted with the R package “vegan”41,42.
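
The logic of the ANOSIM test can be illustrated with a short Python sketch that follows the standard definition of the statistic: all pairwise distances are ranked, and R = (r_B − r_W) / (M / 2), where r_B and r_W are the mean ranks of between-group and within-group distances and M is the number of specimen pairs. The sketch is not the “vegan” implementation; the function names, the profiles, the group labels and the number of permutations are hypothetical, and it assumes that the grouping yields at least one within-group and one between-group pair.

```python
import random
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two binary profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(ranks, pairs, groups):
    """ANOSIM R from ranked distances and a group label per specimen."""
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2.0)


def anosim(profiles, groups, n_perm=999, seed=0):
    """Return (R, permutation p-value) for binary profiles grouped by label."""
    pairs = list(combinations(range(len(profiles)), 2))
    dists = [jaccard_distance(profiles[i], profiles[j]) for i, j in pairs]
    ranks = average_ranks(dists)
    observed = anosim_r(ranks, pairs, groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign group labels at random
        if anosim_r(ranks, pairs, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical profiles: two plants from population "A", two from "B".
profiles = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
labels = ["A", "A", "B", "B"]
r_stat, p_value = anosim(profiles, labels)
print(r_stat, p_value)
```

Values of R close to 1 indicate that specimens within groups are more similar to each other than to specimens from other groups, whereas values near 0 indicate no grouping; with only four specimens, as in this toy example, the permutation p-value cannot become small.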