Introduction

Amaranthus (Family Amaranthaceae) is a cosmopolitan genus with at least 70 species, including ancient cultivated plants such as the grains Amaranthus cruentus, A. caudatus and A. hypochondriacus, the leafy vegetable and ornamental A. tricolor, and the well-known invasive plant A. retroflexus1,2. In addition, the important agricultural weeds A. palmeri and A. tuberculatus are developing herbicide-resistant biotypes, which in turn can spread globally as seed contaminants in grain3,4. Amaranthus species can be difficult to identify, generally requiring adult plants. Genetic tools are urgently needed to rapidly identify Amaranthus samples at any life stage, and to recognize the emergence or entry of new herbicide-resistant biotypes.

In classical taxonomy, Amaranthus is divided into roughly three subgenera5, but these aren’t fully supported by genetic studies. Two subgenera, Amaranthus subgen. Acnida (L.) Aellen ex K.R. Robertson and Amaranthus subgen. Albersia (Kunth) Gren. & Godr., are not monophyletic groups based on the analysis of nuclear, chloroplast genes and genomes6,7. Phyletic position of some species varies depending on which gene is used. The subgeneric classifications of A. palmeri, A. dubius and A. spinosus, A. tuberculatus and A. arenicola, requires further clarification.

Amaranthus palmeri and A. tuberculatus have both evolved herbicide resistant (R) biotypes to at least four herbicide groups: those that target acetolactate synthase (ALS), photosystem II, protoporphyrinogen oxidase (PPO), and 5-enolpyruvylshikimate-3-phosphate (EPSP) synthase8. This is resulting in considerable losses to agriculture, and the continued evolution of resistance poses a significant threat to global food security9,10,11. This is especially concerning as their seeds are often intercepted in abundance among imported grains, and the R biotypes are very likely to be introduced as seed contaminants into importing countries. Jasieniuk et al. considered that the initial frequency and inheritance of resistant alleles are the key factors contributing to the evolution and spread of herbicide-resistant weeds12. Genetic tools are therefore needed to both identify herbicide resistance and to detect the global movement of R alleles.

Acetolactate synthase (ALS) is nuclear encoded, produced in the cytoplasm and transported via a transit peptide to the chloroplasts13. One mechanism of ALS inhibitor resistance is target site-based resistance, which is due to a single nucleotide substitution in the ALS gene14. These mutations involve amino acid substitutions at Ala122, Pro197, Ala205, Asp376, Arg377, Trp574, Ser653, and Gly65415. Under herbicide selection pressure, the R ALS alleles are dominant over the susceptive (S)16. Therefore, the frequency of resistance occurrence to ALS inhibitors and mutation frequency of ALS should be high. ALS inhibitor resistance has been reported in 66 species globally13, including in six Amaranthus species15. The ALS gene has also served as a useful molecular marker to study Amaranthus17.

ITS sequences have been widely used to resolve phylogenetic issues at different taxonomic levels18, and are selected as candidate barcodes for plants19. Song et al. (2000) and Xu et al. (2017) constructed phylogenetic trees among 16 and 23 species (respectively) of Amaranthus in China based on ITS20,21. Murphy et al. (2017) successfully detected A. palmeri in mixed samples using a quantitative PCR method using ITS22. Waselkov et al. (2018) analyzed the phylogenetic relationships of 58 species in Amaranthus based on ITS and a further five genes7. Murphy and Tranel (2018) used species-specific SNPs within the ITS region to identify and validate Amaranthus spp.23. All of these studies show that ITS is useful for identifying Amaranthus species, although limitations remain.

In this study we chose the ALS and ITS gene regions as molecular markers to analyze the phylogeny of Amaranthus, reexamine the taxonomic status of species encountered in China and to find effective SNPs for identifying species that are difficult to distinguish morphologically. Furthermore, we tested for the presence of ALR R alleles in China, with a special focus on samples obtained from grain imports and of naturalized Amaranthus populations in and around port facilities. This is a first step towards quantifying the global movement of herbicide-resistant alleles through contaminants in the global grain trade.

Results

Sequencing analysis

125 sequences of ITS, 85 sequences of ALS (domains C, A and D) and 78 sequences of ALS (domains B and E) in Amaranthus were generated for this study (Table S1). Both ALS regions were successfully amplified in 65 samples (16 Amaranthus species). These two partitioned regions were combined for phylogenetic analysis. All positions containing gaps and missing data were eliminated, and both ends of primers were cut off. There were a total of 621-nt to 624-nt of ITS, 429-nt of ALS (domains C, A and D), 604-nt to 605-nt of ALS (domains B and E) in the final data set. The ITS sequences had 67 variable sites and 61 Parsimony-Informative Sites (PIS), ALS (domains C, A and D) had 28 variable sites and 24 PIS and ALS (domains B and E) had 48 variable sites and 33 PIS.

Phylogenetic analysis

A total of 126 ITS and 66 ALS sequences from 27 Amaranthus species and one outgroup (Bassia scoparia, Genbank accessions: EU517465.1, MH711446.1) were used to construct the phylogenetic trees. According to genetic tree produced by combining ITS and ALS, four clades can be differentiated. Amaranthus palmeri and A. spinosus grouped together in one clade (PAL + SPI) (BS/PP = 98/0.92), A. tuberculatus and A. arenicola grouped together in another (TUB + ARE) (BS/PP = 60/0.59), and species in subgen. Amaranthus (AM) (BS/PP = 100/1) and subgen. Albersia (AL) (BS/PP = 91/0.8) formed their own clades (Fig. 1). In the ITS tree, except A. arenicola clustered with A. tuberculatus closely (BS/PP = 96/1), and other branches were maintained basically besides BS/PP value were decreased slightly (Fig. S1). These clades did not completely align with those produced in the ALS tree. Amaranthus palmeri, A. spinosis and subgen. Amaranthus formed a monophyletic clade (BS/PP = 91/1), with two branches of A. spinosus (BS/PP = 99/1; BS/PP = 76/-), A. dubius (subgen. Amaranthus) and the rest of subgen. Amaranthus (BS/PP = 99/0.99) nested within A. palmeri. Subgen. Albersia again formed a single clade. Amaranthus arenicola (ARE) was no longer clustered with A. tuberculatus, instead becoming a basal clade to A. palmeri, A. spinosus, A. dubius and subgen. Amaranthus (BS/PP = 99/1) (Fig. 1). Amaranthus dubius was not aggregated with subgen. Amaranthus any more (BS/PP = 54/-) (Fig. 2).

Figure 1
figure 1

A maximum likelihood combined gene tree based on ITS and two regions within ALS of Amaranthus and one outgroup. Values at each node indicate maximum likelihood bootstrap support (BS)/Bayesian inference posterior probability (PP) value. Individuals marked with grey backgrounds represent genetically distinct groupings.

Figure 2
figure 2

Maximum likelihood gene trees based on two regions within ALS. Amaranthus clades are delimited with grey highlight indicating anomalous samples (see text). Values at each node indicate maximum likelihood bootstrap support (BS)/Bayesian inference posterior probability (PP) value. The branches of A. albus, A. blitoides, A. blitum, A. palmeri, A. spinosus and parts of A. tuberculatus were compressed.

On the phylogenetic tree produced by combining ITS and ALS sequences, A. palmeri, A. spinosus and A. tuberculatus showed considerable intraspecific variation and structuring, especially A. tuberculatus. The A. tuberculatus clade can be subdivided into multiple groups according to the mutations located between loci 1,770 and 1794 on the ALS gene. These included groups of homozygous individuals (ATCA, ATCG, and TAAG) and individuals with heterozygous mutations at these loci or with irregular point mutations at other loci (Fig. 1, Table 1). Amaranthus palmeri was split up into three sections based on two mutations on the ALS gene (Fig. 1, Table 1). Amaranthus spinosus branched out into two groups by three mutations on the ALS (Fig. 1, Table 1).

Table 1 Comparison of SNPs between the similar species: Amaranthus tuberculatus, A. arenicola, A. palmeri, A. spinosus, A. dubius and A. hybridus on ALS (domains C, A and D) and ALS (domains B and E).

Interspecific and intraspecific variations on ITS and ALS

The average interspecific and intraspecific patristic distances were 0.02582 and 0.00054 respectively for the 125 ITS sequences, and 0.0155 and 0.00162 for the 65 ALS sequences (including the two partitioned regions). ITS was conserved relative to the ALS gene. Intraspecific variation frequency in ALS was greatest for A. tuberculatus (5.13%), A. palmeri (4.81%), and A. spinosus (0.75%), much higher than observed in ITS (0.32%, 0.16%, and 0.00% respectively). Of the 24 A. tuberculatus samples for which both ALS regions were sequenced, there were 22 PIS on ALS, but only one SNP on ITS.

Amaranthus arenicola had similar ALS (domains C, A and D) genotypes to A. palmeri and similar ALS (domains B and E) and identical ITS genotypes to A. tuberculatus (Table 1). Amaranthus dubius (subgen. Amaranthus) had six heterozygotes across the two ALS regions, with half of the alleles of six heterozygotes being identical to those of A. spinosus, and the remainder being identical to the rest of subgen. Amaranthus (Table 1). Three stable mutations of A. spinosus in ALS resulted in all of them grouping into two sections (Fig. 1, Table 1), despite these individuals only having one SNP in ITS. Meanwhile, on the above same loci, several A. palmeri also had corresponding heterozygotes (Table 1), and one individual was almost consistent with A. spinosus with respect to two regions of ALS except for six heterozygotes (Fig. 2). In addition, there were one to two SNPs in A. albus and A. retroflexus, 6 indels in A. blitum on ITS, and two SNPs in A. retroflexus on ALS. Amaranthus standleyanus and A. crispus only differed by one base. Amaranthus capensis differed from A. tenuifolius and A. tricolor by one SNP in ITS.

R ALS alleles in A. arenicola, A. palmeri and A. tuberculatus

In our analysis of ALS, 25 plants with R ALS alleles were detected. Sixteen A. tuberculatus (55.2% of samples), eight A. palmeri (27.6%) and one A. arenicola (100%) were detected with Ala122Asn, Pro197Ser/Thr/Ile, Trp574Leu, and Ser653Thr/Asn/Lys substitutions (Table 2). Of the 31 amino acid mutations of 25 R biotypes, 77.4% were heterozygous. The Ala122Asn mutations were mainly generated in Fujian’s population of A. tuberculatus. However, Ser653Thr/Asn/Lys mutations were found in A. tuberculatus populations from Jiangsu and Shandong (Table 2). The major resistant substitutions in A. palmeri were Pro197Ser/Thr and Trp574Leu: Acession No. 7229 sample had homozygous mutations of Trp574Leu, while the remaining mutations were heterozygous (Table 2). Five individuals of A. tuberculatus and one A. arenicola had two amino acid substitutions (Table 2). The amino acid substitutions of A. arenicola mostly occurred in Pro197Ile and Ser653Thr (Table 2). Among them, the substitutions of Ala122Asn, Pro197Thr/Ile and Ser653Lys were the first to be observed in Amaranthus.

Table 2 Amino acid substituts in sampled species with R ALS alleles*.

Discussion

Our research demonstrates the considerable value of ALS for clarifying phylogenetic relationships within the genus Amaranthus, in assisting with the identification of species that are morphologically difficult to distinguish, and for identifying the presence of herbicide-resistant genes. The main findings and their significance are discussed below.

Our phylogenetic analysis supported subgen. Albersia as a single clade, but the boundary between subgen. Amaranthus and subgen. Acnida was confused by A. palmeri, A. spinosus, A.arenicola and A. dubius. Previous morphological and molecular studies have found A. spinosus and A. palmeri to be closely related24. For example, Riggns et al. (2010) classified A. spinosus as a sister group of A. palmeri based on the ALS gene17.The sequence homology of EPSP synthase (EPSPS) between glyphosate-resistant A. spinosus and glyphosate-resistant A. palmeri supported the hypothesis that the EPSPS amplicon in A. spinosus originated from A. palmeri25. We have observed a similar situation in the ALS gene, namely A. spinosus was grouped into two sections by three stable mutations in ALS and formed parallel branches with A. palmeri. Furthermore, several A. palmeri plants growing together with A. spinosus also had corresponding heterozygotes at the above loci, one of which was consistent with A. spinosus with respect to two regions of ALS except for six heterozygotes. The other sampled A. spinosus individuals came from different regions, grew alone, and have naturalized in inner provinces. Although A. spinosus and A. palmeri have many opportunities to implement gene exchange in nature26, it seems that these intraspecific mutations in A. spinosus have been formed for a long time, and were not an occasional or transient result of hybridization.

A. arenicola is a dioecious species. Female flowers have five perianths like A. palmeri and other species in subgen. Amaranthus, whereas its’ short bracts and long spikes are similar to A. tuberculatus. Stetter and Schmid (2017) placed A. arenicola and A. palmeri together based on genotyping by sequencing methods6. In contrast, Waselkov et al. (2018) concluded from their analysis of four nuclear genes and two chloroplast sequences that A. arenicola is closely related to A. tuberculatus, and believed that the samples of A. arenicola used by Stetter and Schmid (2017) were wrongly identified6,7. We also found that A. arenicola was closer to A. palmeri than A. tuberculatus based on ALS (domain C, A and D), while on ITS and ALS (domain B and E), A. arenicola was basically within the variation range of A. tuberculatus. This discovery is very interesting as we observed A. arenicola, A. tuberculatus, A. palmeri and A. hybridus growing together in the port monitoring area with many heterozygous mutations at some loci being shared among them. The significance of this finding warrants further study.

Amaranthus tuberculatus was previously considered to be two largely allopatric species, A. tuberculatus and A. rudis, according whether the fruit is dehiscent or not, the number of perianth segments, and different geographical origins27. However, a large number of morphological intermediate forms exist between the two, and they are now treated as one species27. Riggins et al. (2010) reported heterozygous mutations on ALS for six individuals of A. palmeri and A. tuberculatus17. In our study, high SNP diversity in ALS resulted in considerable genotypic structuring within A. tuberculatus, but it did not correspond with our morphological sub-species identifications. Waselkov et al. (2018) speculated that present-day A. tuberculatus was derived from hybridization of two ancestors originating from different dioecious species7. Combined with our finding, A. arenicola has homologous mutation types at the same loci as A. tuberculatus at ALS (domain B and E) and ITS. In the place of origin (North America), the male plants of A. arenicola are often incorrectly identified as A. tuberculatus28. We speculated that in some cases the male plants of A. tuberculatus might male with the females of A. arenicola when it entered into a new habitat where is absent of conspecific male plant.

Amaranthus dubius is a known allotetraploid that originated through hybridization between A. spinosus and other species in the A. hybridus aggregate29. Waselkov et al. (2018) found that A. dubius is strongly supported as the sister species to A. spinosus in the chloroplast tree, but clustered with A. hybridus aggregates in the nuclear tree7. In our study, the phylogenetic position of A. dubius on ALS was an out-group to the remaining species in the subgen. Amaranthus. This was supported by the results from SNP and AFLP Package analysis using biallelic markers6. We can deduce from the heterozygote loci that A. dubius has a hybrid origin7.

Herbicide resistance already poses significant challenges for managing Amaranthus weeds in its native range. We detected Ala122Asn, Pro197Thr/Ile and Ser653Lys substitutions on the ALS gene for the first time in Amaranthus. According to the website International Survey of Herbicide Resistance8, the Ala122Asn has previously only been reported on Echinochloa crus-galli var. crus-galli in 2017 where it resulted in strong resistance to four kinds of herbicides15. The substitution of Pro197Thr is relatively common and has been reported on numerous species, whereas the substitution of Pro197Ile had only previously been reported on Sisymbrium orientale15. Both amino acid substitutions lead to herbicide resistance15. The newly discovered substitution Ser653Lys has not previously been reported on any species, and its significance remains unclear.

All of the R ALS alleles we detected were in plants sampled from border monitoring areas within China. These alleles are therefore most likely to have entered China as contaminants of imported grain. Up to 77.4% of the R ALS alleles were heterozygous. R ALS alleles are dominant over S alleles, so can still be selected even under heterozygous conditions16. This is the first study of R ALS alleles in Amaranthus within China, and little is known of their distribution globally beyond some studies in North America. Further work is needed to better understand the threat these and other alleles pose to agriculture in China, including how well these new R biotypes will spread and perform.

We can only make preliminary inferences from genes. Homozygous mutations of Trp574Leu were detected in a sample of A. palmeri coded No. 7229 taken from the Beijing population in 2004. This population was within a processing plant for imported grain. In 2012, two heterozygous mutation biotypes of Trp574Leu and three heterozygous mutation biotypes of Pro197Ser were detected in the samples collected within 1 km of No. 7229. This suggests that after nearly ten years of establishment, the R ALS alleles have spread in the population as heterozygotes. Further work is required to test this hypothesis.

Conclusions

Through our comparative analysis of ALS, we found that ALS can provide many valuable SNPs for species identification, such as for difficult species, A. arenicola and A. dubius, even the different genotypes under species level. On the basis of SNPs, suitable restriction enzymes can be found, and optimal primers can be designed. Moreover, different herbicide-resistant biotypes belonging to different populations of A. tuberculatus shows that these populations have different origins. Through genetic testing, resistant biotypes can be found quickly. This will be critical for managing herbicide resistance in Amaranthus into the future.

Materials and methods

Plant materials and morphological identification

We collected 153 plant and seed samples from 26 species of Amaranthus between 2005 and 2018 (Table S1). 48 plant samples were taken from populations that appear to have been naturalized for some time (including one from Spain). 96 samples were from plants growing in or adjacent to ports, wharfs and processing plants of imported grain where imported grains may leak. These are most likely to have established from contaminated seed. Nine accessions were seeds intercepted from grains imported from the United States (8) and Canada (1). Most samples (n = 109) were plant samples collected from imported grain processing plants and wharfs located at ports within China. 35 plant samples were taken from fields, wastelands and lawns in China and one in Spain.

Seeds were sowed in a Beijing greenhouse (14–25 °C, humidity 50–70%) from April to May (2008). The resulting eight adult plants were harvested for further study. Identification relied on Amaranthus monographs1,2,28,30. One adult plant of A. arenicola originated from intercepted seeds was collected in a plant quarantine nursery. We also referenced the type specimen of Amaranthus arenicola I.M. Johnst. (Collector: A.S. Hitchcock, #428A, stored in Missouri Botanical Garden (MO), MO-247457) on the website of JSTOR Global Plants (www.jstor.org).

DNA extraction and PCR

About 10 mg leaves of each individual of 153 samples were dried by Silica for DNA extraction. These dried leaves were ground into a powder by Grinding Mill (Retsch MM400, Germany) for 1 min (1,800 r·min−1), and DNA was extracted by Plant Genomic DNA Kit (Tiangen Biotech Co., China). Amaranthus palmeri S. Watson (ITS accession: KF493784.1, KC747433.1 and KP318856.1; ALS accession: KY781923.1) and Amaranthus retroflexus L. (ALS accession: AF363369.1) were used as resistant biotype references.

A pair of universal primers ITS1 (5′-TCCGTAGGTGAACCTGCGG-3′) and ITS4 (5′- TCCTCCGCTTATTGATATGC-3′) were used to amplify the ITS region. The PCR reaction mixture contained 1 μL of DNA template (approximately 30 ng of genomic DNA), 10 × of Ex Taq buffer without MgCl2, 10 μM each of forward and reverse primers, 2.5 mM of each dNTP, 2.5 mM MgCl2 and 1 U of ExTaq polymerase (Takara) with ddH2O for a final volume of 25 μL. PCR cycling conditions were: template denaturation at 94 °C for 5 min followed by 30 cycles of denaturation at 94 °C for 1 min, primer annealing at 55 °C for 1 min, and primer extension at 72 °C for 90 s, with a final extension of 10 min at 72 °C.

A pair of primers 2F (5′-GATGTYCTCGTYGARGCTCT-3′) and 2R (5′-AAYCAAAACAGGYCCAGGTC-3′) were used to amplify a region of the ALS gene containing nucleotides coding for Ala122 (conserved nucleotides in domain C), Pro197 (domain A), and Ala205 (domain D), all of which have been linked to ALS inhibitor resistance14. Another pair of primers 5F (5′-ATTCCTCCGCARTATGCSATT-3′) and 6R (5′-CCTACAAAAAGCTTCTCCTCTATAAG-3′) were used to amplify the region within the ALS encompassing domains B (comprising Ser653) and E (the carboxy terminal region, including 13 amino acids after the stop codon14). The PCR reaction mixture contained 1.3 μL of DNA template (approximately 30 ng of genomic DNA), 10 × of Ex Taq buffer without MgCl2, 10 μM each of forward and reverse primers, 2.5 mM of each dNTP, 2.5 mM MgCl2 and 1 U of ExTaq polymerase (Takara) with ddH2O for a final volume of 25 μL. PCR cycling conditions were: template denaturation at 94 °C for 10 min followed by 35 cycles of denaturation at 94 °C for 1 min, primer annealing at 57 °C for 80 s, and primer extension at 72 °C for 80 s, with a final extension of 10 min at 72 °C. Each PCR product was sequenced in forward and reverse directions to minimize sequencing errors.

Phylogenetic analysis

The alignment and adjustment of multiple sequences and SNPs detection were carried out by MAFFT v7.45031 and Geneious Prime v 2020.1.2 (Biomatters, Auckland, New Zealand). The aligned coding sequences of ITS and ALS (domains C, A and D; domains B and E) were concatenated. The DNA substitution model (GTR + I + G model) was chosen using jModelTest 2.1.632, and used in maximum likelihood (ML) analysis and Bayesian inference. ML analysis was conducted using RAxML version 8.0.033 on the Geneious Prime v 2020.1.2 (Biomatters, Auckland, New Zealand). Bayesian inference was conducted using MrBayes 3.2.634 with Ngen = 1,000,000, Samplefreq = 200, and Burninfrac = 0.25. Bassia_scoparia (GenBank Accession: MH711446.1 and EU517465.1) served as the outgroup.