Glycyrrhiza uralensis Fisch. is a perennial herbaceous plant of the legume family. Its dried roots and rhizomes are often used as medicines1 and have a wide range of pharmacological effects, such as heat-clearing, detoxifying, cough-relieving, and phlegm-dissolving effects2 and harmonizing the effects of various drugs. Moreover, G. uralensis possesses anti-inflammatory3, antibacterial4, antioxidant, antiviral5,6, and hepatoprotective activities. In addition, G. uralensis is commonly used in cosmetics, health products, and the food industry7. For instance, glycyrrhizic acid derived from G. uralensis is 50 times as sweet as sucrose and is frequently employed as an additive in confectionery, sauces, and other food items8.

In China, the Ordos Plateau in Inner Mongolia serves as a prominent source of G. uralensis. Located in the transition zone between the Wind-Sand Plateau and the Loess Plateau9, this region has a rugged and complex terrain, with higher elevations in the northwest gradually descending to lower elevations in the southeast. The area benefits from abundant sunshine, intense radiation, high effective cumulative temperatures, wide diurnal temperature variations, soft sandy soils, and nutrient-rich, calcium-rich substrates. These features contribute to the exceptional resilience of local G. uralensis to drought, wind, sand, cold, heat stress, and salinity10, while it also exhibits early rejuvenation and rapid growth. The G. uralensis produced in this area ranks first in terms of quality among similar products worldwide. It is exported globally for its reddish skin color, sufficiently powdery texture, high acid content, glossy cut surface, richness in trace elements, and high medicinal value. G. uralensis also grows in desert and semidesert areas11, and its ecological characteristics have made an important contribution to the maintenance of the ecological environment in the western part of China. In 2002, China classified it as a key controlled medicinal herb and a national second-grade protected plant.

China is the world’s largest producer and exporter of G. uralensis. The area planted with G. uralensis exceeds 66,667 hectares and accounts for about 90% of the total global planted area. China accounts for more than 40% of the global market share of G. uralensis, and imports have reached 80% of the global total. Inner Mongolia is the main area of cultivation of G. uralensis in China, and the area planted with G. uralensis has reached 30% of the total national planted area. As a medicinal plant with economic and ecological benefits, G. uralensis has attracted much attention from all walks of life. However, with the increase in its utilization rate and the expansion of the scope of its exploitation and utilization in China and internationally, wild G. uralensis falls short in terms of meeting human demand. Consequently, artificial cultivation of G. uralensis has emerged as an effective measure for achieving sustainable utilization of this valuable resource.

At present, the main techniques employed in the artificial cultivation of G. uralensis in China include land preparation, seed selection, sowing, field management, and harvesting. Seed selection is the most important step in the whole cultivation process. An excellent seed supply can greatly improve the quality and yield of cultivated medicinal materials. However, according to a survey, seeds of mixed varieties of G. uralensis and seeds of unclear origin are circulating on the market. This situation leads to a waste of land resources and poor field performance after planting. The resulting differences in quality are obvious and have seriously restricted the healthy development of the G. uralensis industry12. At present, no comprehensive screening standards for G. uralensis seeds have been established in China. “Rules for agricultural seed testing,” “Rules of seed testing for forage,” and “The code of practice for seed testing of Chinese medicinal materials” are being temporarily used as standards for evaluating G. uralensis seeds. The studies by Sun et al.13, Zhang et al.14, and Li et al.15 focused only on basic indicators such as purity, germination rate (GR), moisture content (MC), and thousand grain weight (TGW). With the development of the seed industry, however, the requirements of seed quality evaluation are increasing. Therefore, a comprehensive multi-index screening standard for G. uralensis seeds that indicates the reliability of sowing quality will not only provide a methodological reference for the formulation of standards for G. uralensis seeds but also guarantee improvements in the quality and yield of G. uralensis.

To address these problems, this study established a standard for screening high-quality G. uralensis seeds by determining and analyzing the purity, morphology, TGW, hardness rate, GR, MC, viability, vigor, and health of G. uralensis seeds from the Ordos Plateau circulating on the market. In contrast to other screening criteria for G. uralensis seeds, this study established, for the first time, a dedicated standard for the special ecological area of the Ordos Plateau. The determination of seed coat color, size, viability, vigor, and other indicators was also included.

High-quality seeds are the primary condition for ensuring the quality of medicinal materials at the source. The comprehensive screening standard for G. uralensis seeds established in this study will not only improve the scientific rigor and practicability of seed quality evaluation but also help to guide the standardized agricultural production of G. uralensis. It is of great significance for ensuring the quality of medicinal materials, improving the planting efficiency and yield of large-scale cultivated G. uralensis, and promoting the sustainable development of the Chinese herbal medicine industry.

Results

Seed lot purity analysis

The purity analysis of the collected seeds revealed that the difference in weight gained or lost for the sum of the various components was less than 5% of the original weight and that the seeds could be used for subsequent analysis. Moreover, all the seeds collected in this study had a purity of more than 96%, with low contents of impurities and other seeds (Table 1). A high degree of purity provides a basis for grading the quality of G. uralensis seeds, calculating seed prices, and subsequent measurements.

Table 1 Purity analysis results.

Phenotypic characterization results

Table 2 shows significant differences (P < 0.05) in the length, width, and thickness of seeds of different sizes. Seeds in GC1, GC2, and GC3, which all comprised small-sized seeds, were measured and found to have lengths of between 2.44 and 2.50 mm, widths of 2.17–2.20 mm, and thicknesses of 1.39–1.49 mm. There was no significant difference in width between the small-sized seed groups (P > 0.05). Measurements of seeds in GC4, GC5, and GC6, which all comprised large-sized seeds, showed that their lengths ranged from 3.13 to 3.33 mm, their widths were 2.73–2.88 mm, and their thicknesses were 1.63–1.82 mm. There were significant differences between the large-sized seed groups (P < 0.05). Comparison with the small seeds revealed that the large seeds exhibited significant differences in length, width, and thickness (P < 0.05).

Comparison of the TGWs of GC1, GC2, and GC3 revealed that the TGW of GC1 differed significantly from those of the other two groups (P < 0.05), with an average value of 6.740 g. Comparison of the TGWs of GC4, GC5, and GC6 revealed that the TGW of GC4 differed significantly from those of the other two groups (P < 0.05), with an average value of 12.269 g. This value was roughly twice that of GC1, which indicates that the large seeds contained considerably more material. Overall, GC1 and GC4, which were both green, stood out in quality (Table 3).

Table 2 Seed morphology results.

The values are presented as the mean ± standard error. Differences in the capital letters represent significant (P < 0.05) differences in morphometry between GC1, GC2, and GC3 and between GC4, GC5, and GC6.

Table 3 TGW determination results.

The values are presented as the mean ± standard error. Differences in the capital letters represent significant (P < 0.05) differences in TGW between GC1, GC2, and GC3 and between GC4, GC5, and GC6. CV represents the coefficient of variation.

Hardness rate results

In this study, the number of unswollen seeds was determined by the seed soaking method, and the hardness rates were compared between groups of seeds. Figure 1A shows that the hardness rate decreased in the order GC1 > GC4 > GC5 > GC3 > GC2 > GC6. Seeds in GC1 and GC4, which were all green, had the highest hardness rates, and the difference from the other groups was significant (P < 0.05). Research suggests that seed hardness is related to genes, maturity, and the environment16 and that hard seeds can maintain their quality to some extent and have prolonged lifetimes. In production, however, hardness can also prevent seeds from germinating for long periods and create obstacles in the breeding process17. It can be concluded that GC2, GC3, GC5, and GC6, which had hard seed contents of less than 60%, may have been affected by seed maturity and the growing environment, while GC1 and GC4 can be selected as valuable seeds.

Results of germination test under normal growing conditions

In this study, the tissue paper method was used for germination tests. According to Fig. 1B, the results showed that GC1 > GC4 > GC2 > GC3 > GC6 > GC5 in terms of GR and GC4 > GC1 > GC2 > GC3 > GC6 > GC5 in terms of GI. GC1 and GC4 had higher GR and GI values, and the difference was significant (P < 0.05). Between GC2 and GC3, which both comprised small-sized seeds, and GC5 and GC6, which both comprised large-sized seeds, the difference was significant (P < 0.05). It can thus be shown that green seeds and small seeds have high field sowing value and can influence crop yields at the outset.

Results of germination test under drought stress

In this study, polyethylene glycol was chosen to simulate drought stress because it reduces the water potential of cells without penetrating them and is commonly used to simulate drought-stressed environments18. Drought stress experiments were performed using the tissue paper method, as in the normal germination tests. According to Fig. 1C, the GR and GI of the seeds in each group were significantly lower than those in the normal germination experiments. However, the results for GC1 and GC4 were still significantly higher than those for GC2, GC3, GC5, and GC6 (P < 0.05). The differences between GC2 and GC3, which comprised small-sized seeds, and GC5 and GC6, which comprised large-sized seeds, were also significant (P < 0.05). It has thus been shown that green seeds and small seeds have good drought tolerance and can be used as drought-tolerant varieties in sustainable agricultural production.

MC results

The seed MC was determined by high-temperature drying. According to Fig. 1D, the results of the MC determination showed that GC3 > GC5 > GC6 > GC2 > GC1 > GC4 in terms of MC and that the MC of the seeds in each group was less than 7%. In comparison, GC1 and GC4, which both comprised green seeds, had the lowest MC. The difference between the groups was significant (P < 0.05). It has been shown that seed MC is negatively correlated with seed viability19 in many crops during storage and that the value of the MC influences the safe storage and longevity of seeds.

Seed viability results

Based on the staining results, seeds were classified as viable or unviable (Fig. 2), and the number of viable seeds was expressed as the viability rate. The results in Fig. 1E show that in terms of the viability rate GC4 > GC1 > GC3 > GC2 > GC5 > GC6. The viability rates of both GC1 and GC4 were at least 84%, which was superior to the viability rates of GC2, GC3, GC5, and GC6, and the difference was significant (P < 0.05). GC2 and GC3, which both comprised small-sized seeds, were compared with GC5 and GC6, which both comprised large-sized seeds. It was found that the viability rate of small seeds was about twice the viability rate of large seeds, which represented a significant difference (P < 0.05). It has thus been shown that green seeds and small seeds have strong germination potential.

Fig. 1

Determination of indicators. (A) Seed hardness rate of each group. (B) Seed GR and GI of each group under normal growth conditions. (C) Seed GR and GI of each group under drought stress. (D) Seed MC of each group. (E) Seed viability of each group. (F) Seed vigor as represented by electrical conductivity of each group. Statistical analysis was performed using a least significant difference test (P < 0.05), and the number of biological replicates n = 3.

Fig. 2

Seed viability determination. (A) Viable seed. (B) Unviable seed.

Seed vigor results

Electrical conductivity results are negatively correlated with germination test results20. From the results shown in Fig. 1F, it can be seen that in terms of conductivity GC4 < GC1 < GC6 < GC3 < GC5 < GC2, although there were differences between the weights of the seeds in each group (Table 4). In general, it can be seen that GC1 and GC4, which both comprised green seeds, had the lowest conductivity, GC3 and GC6, which both comprised black seeds, had intermediate conductivity, and GC2 and GC5, which both comprised brown seeds, had the highest conductivity, and the difference between the groups was significant (P < 0.05). Correlation of the conductivity results with the results of the germination test revealed a significant negative correlation (R = − 0.720; P < 0.01) between the conductivity and GR in all groups. This correlation indicated that seeds with lower conductivity had a higher GR, which was probably because seeds with a higher GR experienced faster reconstruction of the cell membrane during the initial uptake of water and swelling and less exudation of internal soluble matter.
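The correlation reported above can be illustrated with a minimal sketch of the Pearson coefficient. The function name `pearson_r` and the paired values are hypothetical and are not data from this study; the sketch computes only R and does not reproduce the significance test.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical paired observations (illustration only, not measured values).
conductivity = [15.0, 18.0, 60.0, 75.0, 90.0, 40.0]  # in µS·(cm·g)^-1
germination_rate = [90.0, 88.0, 45.0, 40.0, 30.0, 50.0]  # in %

r = pearson_r(conductivity, germination_rate)
print(f"R = {r:.3f}")  # negative when high conductivity accompanies low GR
```

A negative R from such data would correspond to the pattern described in the text, in which groups with lower conductivity show a higher GR.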

Seed health test results

As the cornerstone of agriculture, seeds have the potential to enable high crop yields and high crop quality. Some potential threats present in seeds (bacteria, fungi, viruses, etc.) have long been considered the main causes of damage to plants and effective means of introducing plant pathogens into new areas. Therefore, seed health testing is carried out regularly in most countries. However, with the development of agriculture and technology, known fungi are undergoing subtle changes. For example, in the course of long-term coexistence with plants, endophytic fungi have gradually formed a mutually beneficial symbiotic relationship. This relationship not only promotes plant growth and enhances plant stress resistance but also provides a potential source for the development of natural active substances. As shown in Table 5 and Fig. 3, eight strains of fungi, namely, Aspergillus niger, Cladosporium cladosporioides, Alternaria tenuissima, Alternaria alternata, Cladosporium sp., Aspergillus foetidus, Aspergillus sydowii, and Talaromyces sayulitensis, were detected in the seeds. These strains were designated GZ-01, GZ-02, GZ-03, GZ-04, GZ-05, GZ-06, GZ-07, and GZ-08, respectively. The detection of these fungi can serve as a basis for studying plant growth promotion or plant stress resistance.

Fig. 3

Fungi isolated from the seeds in each group. (A) GZ-01; (B) GZ-02; (C) GZ-03; (D) GZ-04; (E) GZ-05; (F) GZ-07; (G) GZ-08.

Establishment of quality grading standards

Statistical analysis of G. uralensis seed quality

According to the test procedures specified in the relevant standards, various indices of the G. uralensis seeds collected from the Ordos Plateau were measured. The results are shown in Table 6.

Development of quality grading standards for G. uralensis seeds

The K-means clustering method in SPSS 27.0 was used for cluster analysis of the main test indices. This analysis showed that the main factors affecting the quality of G. uralensis seeds from the Ordos Plateau were purity, TGW, hardness rate, seed GR under normal growing conditions, seed GR under drought stress, MC, viability, and vigor. The seed samples were analyzed and clustered into three classes (Table 7), which serve as the basis for quality grading.
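As a rough illustration of how K-means groups samples into three classes, the following is a minimal pure-Python sketch. It is not the SPSS procedure used in this study: the initialization (evenly spaced input points), the function name `kmeans`, and the two-feature toy data are all assumptions made for illustration, and real indices would normally be standardized before clustering.

```python
from math import dist

def kmeans(points, k, iterations=100):
    """Plain K-means: assign each point to its nearest centroid, then update centroids."""
    # Assumed deterministic initialization: k evenly spaced points from the input order.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: mean of the members of each cluster (keep the old centroid if empty).
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical (GR %, viability %) pairs, not data from this study.
samples = [(90, 86), (88, 84), (45, 50), (40, 48), (20, 25), (22, 24)]
labels, centroids = kmeans(samples, k=3)
print(labels)
```

The cluster labels only group similar samples; mapping clusters to quality grades still requires inspecting the cluster centers, as is done when the grading thresholds are derived.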

Candidate machine learning models were then trained to predict the germination rate and grade of a given sample. K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest models were trained to predict seed grade and were evaluated by 3-fold cross-validation; Random Forest performed best, with a cross-validation result of 1. Elastic Net, Ridge, and Lasso models were trained to predict seed germination and were evaluated by 2-fold cross-validation; Lasso performed best, with a cross-validation result of 0.8662 for seed germination under normal conditions and 0.9345 for seed germination under drought stress.

Table 4 Results of measurement of seed conductivity as index of seed vigor.
Table 5 Fungal strain identification results.
Table 6 Indicators of quality of G. uralensis seeds in each group.

The values are presented as the mean ± standard error. Differences in the capital letters represent significant (P < 0.05) differences between GC1, GC2, and GC3 and between GC4, GC5, and GC6.

Table 7 Quality grading standards for G. uralensis seeds.

Discussion

High-quality herb seeds serve as the most economical and effective input for the development of the herbal medicine industry21. The development of seed quality grading standards for herbal medicines is an important advance for ensuring and improving the quality of seed sources and is a key factor affecting the yield and quality of medicinal herbs. It was found that studies of the evaluation of seed sources and seed quality of medicinal herbs are scarce. The establishment of screening standards for high-quality G. uralensis seeds is therefore important for improving the production of herbal medicines and regulating the market.

In this study, traditional standard procedures for seed quality testing such as the halving method, soaking method, hundred grain weight method, red ink dyeing method, electrical conductivity method, and high-temperature drying method were used to determine the purity, morphology, TGW, hardness, GR, MC, viability, vigor, and health of G. uralensis seeds. Seed purity is a crucial parameter that serves as an essential indicator for assessing seed quality and has a significant influence on the safe storage, germination potential, and commercially valuable traits of seeds22. The results of the analysis of seed purity showed that the seed purity was greater than 96% and the contents of impurities and other seeds were low. These findings indicated that the G. uralensis seeds entering the market had good commercial value. Seed size is a relatively stable trait that is important in the life history of a plant23. According to previous surveys, morphological characteristics of seeds have proved to be useful distinguishing and taxonomic features of legumes24. Seed weight (1000 seeds), length, width, and thickness were used as important indicators to identify G. uralensis taxa at species level. In the analysis of phenotypic traits, significant differences in size and TGW were observed between GC1 and GC4, both of which comprised green seeds. Notably, GC1 and GC4 exhibited pronounced variations in these parameters, and their quality was thus markedly distinguished. After a seed has matured, a large proportion of the seed coat is hard, and it is difficult for water to penetrate it. The phenomenon of a seed being unable to absorb water and sprout immediately after sowing is known as hardness, which is prevalent in nature25. In the hardness rate testing, it was observed that the groups GC1 and GC4 exhibited the highest hardness rates, and all the other test groups had hardness rates of less than 60%. In the seed germination experiment, the seeds were treated with sulfuric acid. 
Although the GR of the seeds was improved to a certain extent, the treatment time was too long, which would lead to damage to the embryo tissue. In the case of conventional sandpaper friction and mechanical peeling, the treatment efficiency was low, and the seeds were easily damaged. In large-scale cultivation of G. uralensis, suitable treatment methods are key factors that ensure the germination of G. uralensis seeds. Germination experiments are conducted to determine the maximum germination potential of seed lots, to compare the quality of different seed lots, and to estimate the value of sowing in the field. The GR has a direct effect on crop yields26. In the germination test under normal growing conditions, it was found that GC1 and GC4 exhibited relatively high GR and GI values, which indicated that these seeds were valuable for field sowing. With the progressive increase in global temperatures and decrease in precipitation, drought stress has become a major factor affecting crop germination and yields27. In the germination test under drought stress, it was found that, although the GR and GI values of all groups were significantly reduced, the results for GC1 and GC4 remained outstanding, which indicated that these seeds possessed a certain degree of drought tolerance. The seed MC can be an important factor affecting seed longevity and quality28. A high MC in seeds will lead to the occurrence of mold, reductions in medicinal components, and frost damage. Controlling the MC is therefore an important step in ensuring the quality of seeds29. In the MC test, it was found that GC1 and GC4 had the lowest MC values. In comparison with the other groups, these seeds had greater storage value. The red ink staining method was used to determine the potential ability of the seeds to germinate or the viability of the seed embryo based on the semipermeability of the protoplasmic membrane of the seed embryo cells. 
In the viability test, it was found that the proportions of viable seeds in GC1 and GC4 were significantly higher than those in the other experimental groups. This finding indicated that the green seeds had greater germination potential. Due to market competitiveness and demand, seed germination tests can no longer be the only method used to determine seed viability. The conductivity method has become one of the most widely studied rapid tests due to its simplicity, low cost, objectivity, and speed. In the vigor test, it was found that GC1 and GC4 exhibited the lowest electrical conductivity. Moreover, there was a significant correlation (P < 0.01) between the electrical conductivity of the seed extract solutions in each group and the GR. Finally, in the course of long-term coexistence with seeds, known fungi are undergoing subtle changes. In the seed health test, it was found that the seeds in each group contained different levels of microorganisms, and eight species of fungi were detected: Aspergillus niger, Cladosporium cladosporioides, Alternaria tenuissima, Alternaria alternata, Cladosporium sp., Aspergillus foetidus, Aspergillus sydowii, and Talaromyces sayulitensis. The detection of these fungi may provide new ideas for research on plant seed resistance.

With comprehensive consideration of the requirements of seed testing for speed, effectiveness, and operability, eight indices, namely, purity, TGW, hardness rate, GR under normal growing conditions and under drought stress, MC, viability, and vigor, were finally selected as the reference criteria of grading standards for G. uralensis seeds. The K-means clustering method in SPSS 27.0 was used to perform an analysis of these indices and formulate quality grading standards for G. uralensis seeds. These quality grading standards classify seeds into three grades, as follows: small-seed TGW ≥ 6.7 g, large-seed TGW ≥ 12.2 g, purity ≥ 99%, hardness rate ≥ 85%, GR under normal growing conditions ≥ 85%, GR under drought stress ≥ 70%, MC ≤ 7%, viability ≥ 80%, conductivity ≤ 20 µS·(cm·g)−1, and smooth surface for first-grade seeds; and small-seed TGW ≥ 4.6 g, large-seed TGW ≥ 11.8 g, purity ≥ 96%, hardness rate ≥ 50%, GR under normal growing conditions ≥ 30%, GR under drought stress ≥ 30%, MC ≤ 7%, viability ≥ 60%, conductivity ≤ 100 µS·(cm·g)−1, and smooth or slightly wrinkled surface for second-grade seeds. However, second-grade seeds have more than two unqualified indices, and therefore seeds that are less than first-grade are classed as failing screening.
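The numerical thresholds listed above can be expressed as a simple rule check. The sketch below is only an illustration: the dictionary layout, the key names, and the function names `meets` and `grade` are hypothetical, the qualitative surface criterion is omitted, and the two example samples are invented rather than taken from this study.

```python
# Thresholds transcribed from the grading standard in the text.
FIRST_GRADE = {
    "tgw_g": {"small": 6.7, "large": 12.2},  # minimum TGW by seed size
    "min": {"purity": 99, "hardness_rate": 85, "gr_normal": 85,
            "gr_drought": 70, "viability": 80},  # all in %
    "max": {"moisture": 7, "conductivity": 20},  # % and µS·(cm·g)^-1
}
SECOND_GRADE = {
    "tgw_g": {"small": 4.6, "large": 11.8},
    "min": {"purity": 96, "hardness_rate": 50, "gr_normal": 30,
            "gr_drought": 30, "viability": 60},
    "max": {"moisture": 7, "conductivity": 100},
}

def meets(sample, spec):
    """Return True if the sample satisfies every numerical threshold in spec."""
    if sample["tgw_g"] < spec["tgw_g"][sample["size"]]:
        return False
    if any(sample[key] < limit for key, limit in spec["min"].items()):
        return False
    return all(sample[key] <= limit for key, limit in spec["max"].items())

def grade(sample):
    """Return 1 or 2 for first- or second-grade seeds, or None if neither set is met."""
    if meets(sample, FIRST_GRADE):
        return 1
    if meets(sample, SECOND_GRADE):
        return 2
    return None

# Hypothetical seed lots (illustration only).
lot_a = {"size": "small", "tgw_g": 6.8, "purity": 99.2, "hardness_rate": 88,
         "gr_normal": 90, "gr_drought": 75, "moisture": 5.5,
         "viability": 86, "conductivity": 15}
lot_b = {"size": "small", "tgw_g": 5.0, "purity": 97, "hardness_rate": 55,
         "gr_normal": 40, "gr_drought": 35, "moisture": 6.5,
         "viability": 65, "conductivity": 80}
print(grade(lot_a), grade(lot_b))
```

Because the text classes seeds below first grade as failing screening, a practical screen built on this sketch would accept only lots for which `grade` returns 1.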

Compared with the research of Sun et al.13, Zhang et al.14, and Li et al.15, this study focused on the characteristics of genuine medicinal materials from the Ordos Plateau and took into account the arid and semi-arid environment in which the seeds grow. For the first time, seed coat color and seed size were included in the classification, and the hard seed rate, the germination rate under drought stress, viability, and vigor were added as quantified indices. By screening drought-tolerant varieties and quantifying seed vigor by electrical conductivity, the standard helps to prevent seeds with a "high germination rate but low vigor" from entering the market, overcomes the tendency in the existing literature to emphasize morphology over physiology, and is more in line with regional production conditions. The establishment of this standard can help to guarantee improvements in the quality and yield of licorice on the Ordos Plateau and also shows that, among Ordos Plateau licorice seeds, the green seeds have greater field sowing value. Standardized seed screening can guide farmers in selecting high-quality provenances and improve the efficiency of planting and the production of medicinal materials. It can also help in the construction of the Good Agricultural Practice system for Chinese medicinal materials and promote the sustainable development of the industry. It lays a foundation for combining seed screening criteria with modern molecular marker technology and for establishing multi-index criteria for evaluating the quality of G. uralensis seeds. It also enables improvements in the comprehensiveness and reliability of screening, as well as improvements in the quality supervision system for seeds of medicinal plants such as G. uralensis.

Materials and methods

Acquisition of materials

G. uralensis seeds were acquired from market circulation channels on the Ordos Plateau, as well as from planting sites such as Duguitala, Xini, and Jirigalangtu, in February, June, and September 2023. These seeds were provided for use in this study.

Classification of materials

Seeds were first analyzed for purity and were then passed through a 10 mesh sieve before being differentiated into six different classes by size (small and large) and color (green, brown, and black). These classes were numbered as GC1–GC6 (GC1: small green seeds; GC2: small brown seeds; GC3: small black seeds; GC4: large green seeds; GC5: large brown seeds; and GC6: large black seeds) for subsequent experiments.

Purity analysis

The specific steps used for testing seed purity were as follows:

(1) The collected seeds were poured onto a smooth, clean flat surface, mixed several times using plastic sheets, and stacked in a quadrilateral pile.

(2) The quadrilateral seed pile was divided evenly into two parts, then divided evenly into four parts, and finally divided into eight parts.

(3) The eight parts of the seed pile were neatly arranged into two rows, and four parts were retained in a crosswise pattern.

(4) The above steps were repeated until a sample of the desired weight was dispensed.

(5) The samples were inspected for components that differed obviously in size or weight and could therefore seriously affect the test results for the mixture. Then the test samples were separated into three categories, namely, authentic seeds, other plant seeds, and impurities, and a balance with a precision of 0.001 g was used for weighing.

(6) After weighing, the sum of the weights of the various components was compared with the original weight, and the percentage purity was calculated if the difference in weight gained or lost was less than 5% of the original weight.

$$\begin{aligned}&\text{Purity} (\%) = \text{weight}\, \text {of}\, \text {authentic}\, \text {seeds}/(\text{weight}\, \text{of}\, \text{authentic}\, \text {seeds} \\&\quad\quad\quad\quad+ \text{weight}\, \text{of}\, \text{impurities} + \text{weight}\, \text{of}\, \text{other}\, \text{plant}\, \text{seeds}) \times 100 \end{aligned}$$
(1)
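Steps (5) and (6) and Eq. (1) can be sketched as a short calculation. The function name `purity_percent`, its parameter names, and the example weights are hypothetical and are used only for illustration.

```python
def purity_percent(authentic_g, impurities_g, other_seeds_g, original_g, tolerance=0.05):
    """Purity (%) as in Eq. (1), computed only if the weight check of step (6) passes."""
    total = authentic_g + impurities_g + other_seeds_g
    # Step (6): the gain or loss in weight must be less than 5% of the original weight.
    if abs(total - original_g) >= tolerance * original_g:
        raise ValueError("component weights differ from the original weight by 5% or more")
    return authentic_g / total * 100

# Hypothetical weights in grams (not data from this study).
print(round(purity_percent(48.5, 1.0, 0.5, 50.2), 2))
```

Raising an error when the weight check fails mirrors the text, in which purity is calculated only if the deviation is below the 5% limit.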

Phenotypic characterization of seeds

Vernier calipers with a precision of 0.01 mm were used to measure the length, width, and thickness of seeds30. Fifty determinations were performed for each set of measurements, which was replicated three times.

Samples of 100 seeds from each test group were weighed on a balance with a precision of 0.001 g, and the measurements were replicated eight times. The mean, standard deviation, and coefficient of variation of the eight replicates were calculated, and the mean 100-seed weight was converted to the 1000-seed weight.
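The conversion from 100-seed weights to the TGW can be sketched as follows. The function name `thousand_grain_weight` and the eight replicate weights are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev

def thousand_grain_weight(hundred_seed_weights_g):
    """Summarize replicate 100-seed weights (g) and scale the mean to 1000 seeds."""
    m = mean(hundred_seed_weights_g)
    sd = stdev(hundred_seed_weights_g)  # sample standard deviation (assumed)
    return {
        "tgw_g": m * 10,             # 100-seed mean scaled to 1000 seeds
        "sd_g": sd,                  # standard deviation of the 100-seed weights
        "cv_percent": sd / m * 100,  # coefficient of variation
    }

# Hypothetical replicate weights in grams (not data from this study).
replicates = [0.670, 0.675, 0.680, 0.672, 0.678, 0.674, 0.676, 0.675]
result = thousand_grain_weight(replicates)
print(round(result["tgw_g"], 3), round(result["cv_percent"], 2))
```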

Hardness rate determination

In each test group 100 seeds were randomly counted out and soaked in water for 24 h. The number of unswollen seeds, which were regarded as hard seeds, was counted, and the measurement was replicated three times. The average of the three replicates was taken as the standard value31.

$$\text{Hardness}\, \text{rate}\, (\%) = (\text{number}\, \text{of}\, \text{unswollen}\, \text{seeds}/\text{number}\, \text{of}\, \text{test}\, \text{seeds}) \times 100$$
(2)

Seed germination test under normal growing conditions

In each test group 50 seeds were randomly counted out, and the test was replicated eight times32. First, the seed coat was scarified and the seeds were disinfected as follows: soaking in 98% H2SO4 for 35 min; rinsing with distilled water five times; soaking in 70% ethanol for 3 min; rinsing with distilled water five times; sterilizing with 0.1% HgCl2 for 1 min; and rinsing with sterilized distilled water five times. Subsequently, the treated seeds were placed in germination cassettes padded with two layers of filter paper moistened with 15 mL sterilized distilled water, and the cassettes were kept in a light culture box at 26 °C with a light cycle of 16 h/day for 8 days.

The number of germinating seeds was counted from the second day after inoculation. Germination was considered to have occurred when the radicle had broken through the seed coat by at least 2 mm33. The GR and germination index (GI) were calculated at the end of the eighth day.

$$\text{GR}\, (\%) = (\text{number}\, \text{of}\, \text{germinating}\, \text{seeds}/\text{number}\, \text{of}\, \text{test}\, \text{seeds}) \times 100$$
(3)
$$\text{GI} = \sum \text{number}\, \text{of}\, \text{germinating}\, \text{seeds}\, \text{on}\, \text{a}\, \text{given}\, \text {day}/\text{corresponding}\, \text{number}\, \text{of}\, \text{germination}\, \text{days}$$
(4)
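Equations (3) and (4) can be illustrated with a short calculation. The sketch assumes that the "number of germinating seeds on a given day" means the seeds that newly germinated on that day; the function names and the daily counts are hypothetical.

```python
def germination_rate(daily_counts, n_seeds):
    """GR (%) as in Eq. (3): total germinated seeds divided by seeds tested."""
    return sum(daily_counts.values()) / n_seeds * 100

def germination_index(daily_counts):
    """GI as in Eq. (4): sum over days of (seeds germinated that day / day number)."""
    return sum(count / day for day, count in daily_counts.items())

# Hypothetical newly germinated seeds per day for one replicate of 50 seeds.
# Counting starts on day 2 and ends on day 8, as in the protocol.
counts = {2: 10, 3: 15, 4: 8, 5: 4, 6: 2, 7: 1, 8: 0}

gr = germination_rate(counts, n_seeds=50)
gi = germination_index(counts)
print(f"GR = {gr:.1f}%, GI = {gi:.2f}")
```

Because each daily count is divided by its day number, seeds that germinate early contribute more to the GI than seeds that germinate late, which is why two lots with the same GR can differ in GI.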

Seed germination test under drought stress

In each test group 50 seeds were randomly counted out, and the test was replicated eight times32. The seed treatment method and the calculation of the GR and GI were as described for the germination test under normal growing conditions. The filter paper in the germination cassette was moistened with 15 mL of 20% polyethylene glycol-6000 solution, and the treated seeds were then placed on it. The GR and GI were calculated at the end of the experiment.

MC determination

Crushed seeds were placed in constant-weight aluminum boxes (5 ± 0.001 g/box) and dried at 133 ± 2 °C until a constant weight was achieved (about 2 h). After drying, the aluminum boxes containing the samples were placed in a desiccator, cooled to room temperature, and weighed to calculate the MC34.

$$\begin{aligned}& \text{MC} (\%) = (\text{weight}\, \text{of}\, \text{aluminum}\, \text{box}\, \text{and}\, \text{sample}\, \text{before}\, \text{drying} - \text{weight}\, \text{of}\, \text{aluminum}\, \text{box}\, \text{and}\, \text {sample}\, \text {after}\, \text{drying})\\&\quad\quad\quad\quad/(\text{weight}\, \text{of}\, \text{aluminum}\, \text{box}\, \text{and}\, \text{sample}\, \text{before}\, \text{drying} - \text{weight}\, \text{of}\, \text{aluminum}\, \text{box}) \times 100 \end{aligned}$$
(5)
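
The following Python sketch illustrates Eq. (5) for a single aluminum box. The function name and the three weights are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def moisture_content(box_g, box_plus_wet_g, box_plus_dry_g):
    """MC (%) = water lost on drying / fresh sample weight * 100, as in Eq. (5)."""
    fresh_sample_g = box_plus_wet_g - box_g
    if fresh_sample_g <= 0:
        raise ValueError("box plus sample must weigh more than the empty box")
    return (box_plus_wet_g - box_plus_dry_g) / fresh_sample_g * 100


# Hypothetical weights in grams: empty box, box + sample before and after drying.
mc = moisture_content(box_g=20.000, box_plus_wet_g=25.000, box_plus_dry_g=24.600)
print(f"MC = {mc:.2f}%")
```

Here a 5 g sample that loses 0.4 g on drying gives an MC of 8%, expressed on a fresh-weight basis because the denominator is the sample weight before drying.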

Seed viability determination

A seed viability test was carried out using the red ink staining method. For each test group, 50 seeds were randomly counted out. The seed coat was abraded with sandpaper, and the seeds were soaked in water for 12 h at room temperature. The whole seed coat was then removed, and the intact cotyledons and embryos were taken out and stained with 5% red ink for 10–15 min. After staining, the staining solution was poured off, and the seeds were washed with tap water until the effluent was colorless. Seeds were scored according to the staining of the embryo: a seed whose embryo showed no or only slight staining was considered viable, whereas a seed whose embryo was completely red and of the same color as the endosperm was considered unviable or mostly unviable. The percentage seed viability was then calculated35. The test was repeated four times in each group.

$$\text{Viability}\, (\%) = (\text{number}\, \text{of}\, \text{viable}\, \text{seeds}/\text{number}\, \text{of}\, \text{test}\, \text{seeds}) \times 100$$
(6)
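
As an illustration of Eq. (6) applied across the four replicates of one group, the sketch below computes the viability of each replicate and then summarizes the group as a mean and sample standard deviation. The viable-seed counts are hypothetical, and summarizing as mean ± SD is an assumption rather than a procedure stated in the text.

```python
from statistics import mean, stdev


def viability_percent(viable, tested=50):
    """Viability (%) = viable seeds / seeds tested * 100, as in Eq. (6)."""
    return viable / tested * 100


# Hypothetical numbers of unstained (viable) seeds in four replicates of 50 seeds.
viable_counts = [46, 44, 47, 45]

percentages = [viability_percent(v) for v in viable_counts]
mean_viability = mean(percentages)
sd_viability = stdev(percentages)  # sample SD (n - 1 denominator)
print(f"Viability = {mean_viability:.1f} ± {sd_viability:.2f}%")
```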

Determination of seed conductivity as an index of seed vigor

For each test group, 50 seeds were randomly counted out and weighed using a balance with a precision of 0.001 g. After weighing, the seeds were washed with distilled water, and clean filter paper was used to absorb the surface water. The seeds were then placed in a clean beaker containing 100 mL ultrapure water. A conductivity meter was used to determine the initial conductivity, and the conductivity of the soaking solution was determined again after the seeds had been immersed for 24 h at a constant temperature of 25 °C36. Each group of experiments was replicated three times.

$$\begin{aligned}&\text{Conductivity}\, (\mu\text{S}\, \text{cm}^{-1}\, \text{g}^{-1}) = [(\text{conductivity}\, \text{difference}\, \text{in}\, \text{replicate}\, 1/\text{weight}\, \text{in}\, \text{replicate}\, 1) \\&\quad + (\text{conductivity}\, \text{difference}\, \text{in}\, \text{replicate}\, 2/\text{weight}\, \text{in}\, \text{replicate}\, 2) \\&\quad + (\text{conductivity}\, \text{difference}\, \text{in}\, \text{replicate}\, 3/\text{weight}\, \text{in}\, \text{replicate}\, 3)]/3\end{aligned}$$
(7)
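
A minimal Python sketch of Eq. (7) is given below. It assumes that the "conductivity difference" of a replicate is the reading after 24 h of soaking minus the initial reading; the readings and seed weights are hypothetical.

```python
def seed_conductivity(replicates):
    """Mean weight-normalized conductivity (µS cm^-1 g^-1), as in Eq. (7).

    Each replicate is (initial_reading, reading_after_24h, seed_weight_g),
    with readings in µS/cm.
    """
    per_replicate = [
        (final - initial) / weight_g for initial, final, weight_g in replicates
    ]
    return sum(per_replicate) / len(per_replicate)


# Hypothetical readings for three replicates of 50 seeds each.
reps = [(2.0, 52.0, 1.000), (2.0, 62.0, 1.200), (2.0, 46.0, 1.100)]
conductivity = seed_conductivity(reps)
print(f"Conductivity = {conductivity:.2f} µS/(cm·g)")
```

Dividing by seed weight before averaging keeps replicates with heavier seeds from dominating the result.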

Seed health determination

Seed external microorganism determination

Preparation of spore suspension: for each test group, 100 seeds were randomly counted out and placed in a conical flask, to which 25 mL sterilized distilled water was added. The flask was shaken well, and the whole suspension was transferred to a centrifuge tube and centrifuged at 2000 rpm for 10 min. The supernatant was discarded, and 1 mL sterilized distilled water was then added and mixed well with the pellet to prepare the spore suspension.

Colony culture: on an ultraclean bench, 100 µL of the spore suspension was aspirated and inoculated onto potato dextrose agar medium (90 mm plates), and the plates were incubated upside down at 28 °C for 5 days to observe the growth of colonies. The growing fungi were isolated, purified, and identified. Under the same operating conditions, sterile water was used as a blank control. The experiment was replicated three times.

Seed internal microorganism determination

Seed treatment: in each test group 50 seeds were randomly counted out, placed in a conical flask, soaked in 98% H2SO4 for 50 min, washed with sterilized distilled water three times, and dried.

Inoculation: on an ultraclean bench, the treated seeds were placed onto potato dextrose agar medium (90 mm plates) and cultured at 28 °C for 5 days. The growing fungi were isolated, purified, and identified. Under the same operating conditions, the water from the final rinse was used as a blank control.

Molecular identification of colonies

A fungal DNA extraction kit (Solarbio, Beijing, China) was used to extract DNA from each isolated and purified single colony. The universal fungal primers ITS1 (5’-TCCGTAGGTGAACCTGCGG-3’) and ITS4 (5’-TCCTCCGCTTATTGATATGC-3’) were used for PCR amplification of the genomic DNA. The PCR reaction mixture consisted of 1 µL DNA template, 1 µL forward primer, 1 µL reverse primer, and 1 µL 2×EasyTaq PCR SuperMix and was made up to 25 µL with ddH2O. The PCR conditions were as follows: 94 °C for 3 min; 30 cycles of 94 °C for 30 s, 54/58 °C for 30 s, and 72 °C for 1 min; a final extension at 72 °C for 5 min; and a hold at 4 °C.
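
The thermal cycling program described above can be written down as a small data structure, which makes it easy to check the total programmed hold time. The Python sketch below is illustrative only: the stage names and the `STAGES` layout are hypothetical, 54 °C stands in for the "54/58 °C" annealing step (the temperature does not affect the timing), and ramp times and the final 4 °C hold are ignored.

```python
# Each stage: number of repeats and a list of (temperature in °C, hold time in s).
STAGES = [
    {"name": "initial denaturation", "repeats": 1, "steps": [(94, 180)]},
    {"name": "amplification", "repeats": 30, "steps": [(94, 30), (54, 30), (72, 60)]},
    {"name": "final extension", "repeats": 1, "steps": [(72, 300)]},
]


def total_hold_seconds(stages):
    """Sum the programmed hold times over all stages and repeats."""
    return sum(
        stage["repeats"] * sum(seconds for _, seconds in stage["steps"])
        for stage in stages
    )


total_seconds = total_hold_seconds(STAGES)
total_minutes = total_seconds / 60
print(f"Programmed hold time: {total_seconds} s ({total_minutes:.0f} min)")
```

The 30 amplification cycles account for 3600 s of the 4080 s total; actual run time on a thermal cycler is longer because of heating and cooling between steps.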

After PCR amplification, 1% agarose gel electrophoresis was used to confirm the amplification results, and PCR products showing a clear single band were sent to Sangon Biotech (Shanghai) Co., Ltd for sequencing. The valid reads from successful sequencing were manually assembled, and the assembled sequences were analyzed by BLAST against the GenBank database to identify the species of the strains.

Statistical analysis

Excel 2010 (Microsoft Corp., Redmond, WA, USA) and SPSS 27.0 (IBM Corp., Armonk, NY, USA) were used for data analysis. Origin 2024 (OriginLab Corp., Northampton, MA, USA) and Photoshop 2021 (Adobe, San Jose, CA, USA) were used to generate charts and graphics. Machine learning models, namely K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Random Forest, Elastic Net, Ridge, and Lasso, were trained and used to predict outcomes37,38.
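
As a hedged illustration of how the six model families named above could be compared, the sketch below fits regression variants of each with scikit-learn and scores them by five-fold cross-validation. The text does not specify the predictors, the target variable, the hyperparameters, or the validation scheme, so all of these are assumptions, and the data are synthetic stand-ins rather than measurements from this study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data: 60 samples x 4 hypothetical seed-quality predictors,
# with a made-up response that depends linearly on the first two predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = 70 + 5 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=2.0, size=60)

# Hyperparameters are arbitrary illustrative choices, not tuned values.
models = {
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "SVM": SVR(kernel="rbf", C=10.0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Elastic Net": ElasticNet(alpha=0.1, l1_ratio=0.5),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
}

cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = {}
for name, model in models.items():
    # Standardize inside the pipeline so scaling is refit on each training fold.
    pipeline = make_pipeline(StandardScaler(), model)
    r2 = cross_val_score(pipeline, X, y, cv=cv, scoring="r2")
    scores[name] = r2.mean()
    print(f"{name:>13}: mean R^2 = {scores[name]:.3f}")
```

Placing the scaler inside the pipeline avoids leaking information from the held-out fold into training, which matters for the distance-based and penalized models in this list.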