Introduction

The rapidly growing world population makes it urgent to increase and stabilize agricultural production. By 2050, global food demand is projected to increase by 35–62%1. A range of measures is required to ensure a stable food supply and achieve sustainable agricultural production, including conserving land and water resources and developing climate-resilient crop varieties2. Soil is a vital resource for food production, and understanding soil health and applying appropriate soil improvement practices are critical for sustainable agricultural production3.

Various types of organic compounds are present in the soil. Low molecular weight compounds include volatiles, carbohydrates, organic acids, and amino acids, whereas relatively high molecular weight compounds include phenolics, lipids, and secondary metabolites in the metabolome4. Volatile organic compounds (VOCs) are carbon-containing compounds that evaporate at normal temperature and pressure. Alcohols, carbonyls, alkanes, aromatics, terpenes, and sulfur-containing compounds are among the VOCs reported in humans, insects, plants, and microorganisms5,6,7,8,9. Studies on soil VOCs have primarily focused on the microbial production of specialized metabolites and their physiological activities10, the types and distribution of VOCs produced by soil microorganisms and plants11, and VOCs in the rhizosphere12. Although some studies have examined soil polar metabolites in fields and their application in soil condition assessment13, comprehensive analyses of soil VOCs in fields with growing crops are still lacking.

Metabolomics is a promising field that is widely used to decipher the metabolic status of various organisms, including plants and microorganisms. Nuclear magnetic resonance (NMR)- and mass spectrometry (MS)-based analytical techniques have been applied in soil metabolomics for the evaluation of soil conditions13,14,15,16 and environmental assessment17,18. The soil volatilome, which is the totality of volatile compounds present in the soil, constitutes a subset of the soil metabolome19; however, only a few reports exist on volatilomics of agricultural field soil20.

Soybean (Glycine max), one of the world’s most important crops, serves as a pivotal source of protein feed and vegetable oil. Its global demand has markedly increased over the past several decades. Soybean plays a crucial role in Japanese cuisine, with uniquely processed products like tofu being integral to the country’s food culture. However, Japan faces challenges with low domestic soybean cultivation and unstable yields. Sustainable agricultural practices that minimize environmental impact while ensuring stable yields are essential to secure a stable soybean supply.

As healthy soil is crucial for soybean production, this study focused on the soil in soybean fields. We conducted soil volatilomics using gas chromatography–mass spectrometry combined with headspace-solid phase microextraction (HS-SPME–GC–MS) analysis to gain new insights into changes in VOC composition under different soil conditions, years, and locations. Comprehensive HS-SPME–GC–MS analysis captured seasonal variation in VOC composition under different soil conditions over three years. Notably, the presence of soybean plants influenced soil VOC levels. VOC profiles from the flowering period showed a strong correlation with other soil-related omics datasets but no correlation with those of the rhizosphere chemicals or the root microbiome dataset. Our findings suggest that soil VOC profiles hold significant promise for evaluating soil health in agricultural fields.

Results

Experimental design

This study employed comprehensive VOC profiling of soil from a soybean field to evaluate the potential of soil VOCs as indicators of soil health status. First, the sampling depth was determined. As soybean root length was approximately 130 mm (13 cm), a depth of 5–15 cm was chosen (Supplementary Fig. S1). Soil from two different plots was sampled in concentric circles at 40-cm intervals around the plants. According to the experimental design, soil samples representing four different soil conditions were collected from agricultural fields in Fukushima Prefecture, Japan, over three years. Soil samples were collected every week in 2019, and twice a week in 2020 and 2021 in Koriyama City. To broaden the study, soil samples were also collected in Fukushima City in 2021 (Table 1). Supplementary Table S1 and Supplementary Method S1 present more detailed information. The ratio of soil temperature at 10-cm depth to soil–water content was relatively high, except during the flowering season in 2020 (Supplementary Fig. S2).

Table 1 Experimental design of the study related to soil conditions.

Samples were collected at locations marked with circles. Sampling was not conducted at locations marked with crosses. Soil samples were collected every week in 2019 and twice a week in 2020 and 2021.

Soybean plants were grown in agricultural fields with different soil conditions. The soil conditions were as follows: “no soybean” represents the unfertilized condition without soybean; “chemical” represents the chemical fertilized condition with soybean; “no fertilizer” represents the unfertilized condition with soybean; and “cow manure” represents the cow manure fertilized condition with soybean.

Overview of soil VOC profiles over the 3 years

Overall, 200 VOC peaks were detected after conducting VOC profiling of approximately 150 soil samples over a 3-year period. Of these, 36–62 compounds were tentatively identified by referring to six reference libraries of known compounds (Supplementary Tables S3–S5, Supplementary Method S1). According to their chemical structures, the annotated compounds were classified into nine classes: alcohol, aldehyde, alkane, aromatic, ester, fatty acid, heterocyclic, ketone, and terpene (Fig. 1).

Fig. 1

The proportion of annotated compound classes detected in agricultural field soil between 2019 and 2021. The number in parentheses represents the number of annotated compounds. Total number of detected peaks: 287 in 2019; 181 in 2020; and 314 in 2021. Number of replicates for the analysis: n = 176 in 2019; n = 160 in 2020; and n = 146 in 2021.

Throughout the 3 years, the five most abundant compound classes (alcohol, aldehyde, alkane, aromatic, and terpene) accounted for more than 80% of the total. Eleven compounds (1-hexanol, styrene, (Z)-2-heptenal, benzaldehyde, octanal, nonanal, m-dimethoxy benzene, naphthalene, tetradecane, 2,4,7,9-tetramethyl-5-decyn-4,7-diol, and pentanoic acid, 2,2,4-trimethyl-3-carboxyisopropyl, isobutyl ester) were commonly detected over the three years. Aromatic hydrocarbons, such as styrene and naphthalene, are thought to be soil contaminants21, but some microbes produce these compounds via enzymatic reactions22,23. Four compounds, namely 2-ethyl-1-hexanol, undecane, 2,4-di-tert-butylphenol, and epicubenol, were commonly detected in soil samples collected in 2019 and 2020. Seven compounds, namely hexanal, 1-octen-3-ol, p-cymene, 1-nonanol, (−)-borneol, α-terpineol, and nonanoic acid, were commonly detected in 2019 and 2021. Five compounds, namely 3-heptanone, decane, (−)-camphor, heptadecane, and hexadecanoic acid, methyl ester, were commonly detected in 2020 and 2021. Unique VOCs were detected each year: 13.5% (14 compounds) in 2019, 24% (25 compounds) in 2020, and 36.5% (38 compounds) in 2021. These findings indicate annual diversity in the chemical composition of soil VOCs.
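The reported percentages can be checked with a short calculation. The sketch below is only an illustration: it assumes that the four groups of shared compounds and the three groups of year-specific compounds listed above do not overlap, so that their sum (104) is the denominator behind the reported shares. The text does not state this denominator explicitly, and the group labels are introduced here for illustration.

```python
# Counts of annotated compounds, taken from the paragraph above.
# Assumption: the groups are mutually exclusive (e.g. the four compounds
# shared by 2019 and 2020 are not among the eleven found in all three years).
shared_counts = {
    "2019, 2020 and 2021": 11,
    "2019 and 2020": 4,
    "2019 and 2021": 7,
    "2020 and 2021": 5,
}
unique_counts = {"2019": 14, "2020": 25, "2021": 38}

# Inferred total number of distinct annotated compounds.
total = sum(shared_counts.values()) + sum(unique_counts.values())


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


unique_share = {year: share(n, total) for year, n in unique_counts.items()}

for year, pct in unique_share.items():
    print(f"{year}: {unique_counts[year]} unique compounds = {pct:.1f}% of {total}")
```

Under this assumption the computed shares agree with the values given in the text, which suggests that the percentages refer to the pooled set of annotated compounds rather than to the compounds annotated in each individual year.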

Seasonal variation in soil VOC profiles under varying soil conditions

We investigated changes in VOC profiles under four soil conditions in each year: the unfertilized condition without soybean (no soybean), the chemical fertilized condition with soybean cultivation (chemical), the unfertilized condition with soybean cultivation (no fertilizer), and the cow manure fertilized condition with soybean cultivation (cow manure) (Fig. 2). During the flowering period, the cumulative VOC levels under the cow manure condition rose to about twice their usual level in each sampling year. A similar trend in the VOC profile changes was observed even in the unfertilized condition with soybean cultivation (Supplementary Fig. S3). In contrast, the relative VOC content was low during the season in the plot without soybean plants (Fig. 2). In 2020 and 2021, around the flowering period, the cumulative levels of soil VOCs in the “cow manure” treatment were twice as high as those in the “no soybean” treatment. To assess whether the presence of soybean plants affects soil VOC accumulation across different soil qualities, we conducted soil VOC profiling in an agricultural field in Fukushima City, Japan (Supplementary Method S1). Fukushima City lies ~40 km from Koriyama City, which suggests similar weather conditions at the two locations. A comparison of the soil VOC profiles between Koriyama (andosol soils) and Fukushima (lowland soils) showed that the cumulative levels of the VOC profiles in the absence of soybean plants were very low (Supplementary Fig. S4). These results suggest that the presence of soybean plants during the flowering period may contribute to soil VOC accumulation.

Fig. 2

Comparison of compound accumulation in soil under varying conditions. Soybean-grown soil with cow manure in 2019 (A), 2020 (B), and 2021 (C). Soil without soybean plants in 2020 (D) and 2021 (E). All soil samples were collected at the Koriyama field. Colors in each stacked bar chart represent the compound class. Black arrows represent the flowering period. DAS day after sowing. For detailed information on the sampling date, see Supplementary Table S2.

Soil VOC signature to differentiate the presence of soybean plants in an agricultural field

To compare soil VOC profiles between the “soybean” and “no soybean” conditions, we focused on the 18 VOCs that were commonly detected in soil samples collected in 2020 and 2021 throughout the period, from before sowing soybean seeds to harvesting in the Koriyama field (Table 2, Supplementary Fig. S5). Principal component analysis (PCA) revealed five outliers in the principal component (PC) 1 direction. These outliers were collected in the summer of 2021 (Supplementary Fig. S6). The PC2 direction could explain the differences in the sampling years.

Table 2 Common VOCs found in agricultural soil profiles in 2020 and 2021.

Next, we conducted a more detailed analysis of the soil VOC profiles during the flowering period in the two subsequent years after removing the outliers. A two-way orthogonal partial least squares-discriminant analysis (O2PLS-DA) score scatter plot differentiated the soil VOC profiles with and without cultivating soybean plants (Fig. 3). Analysis of the O2PLS-DA loading column plot revealed a statistically discriminative soil VOC, pentanoic acid, 2,2,4-trimethyl-3-carboxyisopropyl isobutyl ester (hereafter referred to as “VOC pentanoate”), which contributes to the separation of samples from soybean-planted plots along the positive PC1 axis. This VOC was detected in soil over the three years 2019–2021.

Fig. 3

Discriminating factors in soil VOC profiles with or without soybean plants in the flowering period in the Koriyama field. A two-way orthogonal partial least squares-discriminant analysis (O2PLS-DA) score scatter plot (A) and loading column plot (B). The structure of Peak 19, pentanoic acid, 2,2,4-trimethyl-3-carboxyisopropyl, isobutyl ester (C). In (A), the yellow circle represents soil samples without cultivating soybean plants, while the purple square represents soil samples with soybean plants. Observations = 44, variables = 20, including two dummy variables. Significant, analysis of variance testing of cross-validated predictive residuals (CV-ANOVA), pCV < 0.05. t[1], the first X-score; to[1], the first orthogonal X-score; pq[1], X loading weight p and Y loading weight q combined into one vector.

Relationships between the soil VOCs and other omics datasets by correlation analysis

Thus far, the results suggest that soil VOC profiles can serve as a valuable indicator of soil status. To assess the potential of soil VOCs for soil status evaluation, correlation analysis was performed between the soil VOC profiles and six soil omics datasets, namely the soil ionome, soil microbiome, root microbiome, soil metabolome, rhizosphere chemicals, and soil physics. Before the correlation analysis, we conducted PCA using the soil VOC profiles in 2021 during the flowering period in Koriyama and Fukushima cities, from which PC1 to PC8 were extracted. PCA was also performed on the other soil omics datasets to reduce the dimensionality of each dataset. UniFrac-weighted principal coordinate analysis was performed for the soil and root microbiome datasets to account for the phylogenetic relationships between microbes. The correlation analysis between PC1–PC8 of the soil VOC profiles and the PC1 of each of the six soil omics datasets showed that the PC1 of the soil VOC profiles was significantly correlated with the PC1 of each soil-related omics dataset but not with the PC1 of the 14 rhizosphere chemicals or the root microbiome dataset (Fig. 4). The PC1 of the soil ionome data (19 elements), the soil microbiome data (Supplementary Fig. S7), and the soil metabolome data (110 annotated metabolites, mainly primary metabolites such as sugars, amino acids, and organic acids) was positively correlated with the PC1 of the VOC profiles. Conversely, a negative correlation was observed between the PC1 of the VOC profiles and that of the soil physics dataset, which comprised 11 characteristics, namely hydraulic conductivity, dry bulk density, porosity, water at pF1.8, water at pF3, total carbon (C), total nitrogen (N), mean weight diameter, mean soil hardness, maximum soil hardness, and penetration depth. PC1 loadings were derived from the four PCAs for the VOC profiles, soil ionome, soil metabolome, and soil physics datasets (Supplementary Figs. S8, S9). Styrene was the top contributor in the soil VOC profile data, followed by VOC pentanoate (Fig. 5). The top three factors showing positive correlations with the soil VOC profiles in each loading column plot were Al, Rb, and Ba in the soil ionome data and isocitrate, xylulose, and 1,6-anhydro-β-d-glucose in the soil metabolome data. In the soil physics dataset, PC1 showed positive loadings for nine factors, while the three negative loadings were for factors related to soil hardness and density.
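The dimensionality-reduction and correlation step described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the two small matrices are hypothetical, the data are mean-centered but not scaled (the text does not state the scaling used), PC1 is obtained by power iteration on the covariance matrix, and Pearson correlation is assumed because the type of correlation coefficient is not specified.

```python
import math


def center(matrix):
    """Subtract each column mean (rows = samples, columns = variables)."""
    n = len(matrix)
    means = [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


def pc1_scores(matrix, iterations=500):
    """Sample scores on the first principal component (power iteration)."""
    x = center(matrix)
    n, p = len(x), len(x[0])
    # Sample covariance matrix (p x p).
    cov = [[sum(x[i][a] * x[i][b] for i in range(n)) / (n - 1)
            for b in range(p)] for a in range(p)]
    # Start vector; it must not be orthogonal to the leading eigenvector.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return [sum(row[j] * v[j] for j in range(p)) for row in x]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den


# Hypothetical data: six soil samples, three VOC peaks and two elements.
voc = [[1.0, 2.1, 0.5], [2.0, 3.9, 0.4], [3.0, 6.2, 0.6],
       [4.0, 8.1, 0.5], [5.0, 9.8, 0.4], [6.0, 12.1, 0.6]]
ionome = [[10.0, 5.0], [12.0, 7.0], [15.0, 8.0],
          [17.0, 11.0], [20.0, 12.0], [22.0, 14.0]]

r = pearson(pc1_scores(voc), pc1_scores(ionome))
print(f"Correlation between VOC PC1 and ionome PC1: {r:.3f}")
```

Because the sign of a principal component is arbitrary, the direction of a correlation between two PC1 score vectors is only interpretable together with the corresponding loadings, which is why the loading plots are examined alongside the correlation plot.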

Fig. 4

Correlation plot of principal components (PCs) 1 to 8 of the VOC dataset with PC1 of each dataset (soil ionome, soil microbiome, root microbiome, soil metabolome, rhizosphere chemicals, and soil physics). Proportion in parentheses represents the contribution ratio in each PC1. Significant, false discovery rate (FDR) < 0.05. *FDR < 0.05, **FDR < 0.01 and ***FDR < 0.001.

Fig. 5

The PCA column loading plot of the soil VOC dataset during the flowering period in 2021. For the VOC information, see Supplementary Table S5.

Comparison of microbial communities in soil and roots

Since the soil VOC profiles were significantly correlated with the soil microbiome dataset but not with the root microbiome dataset during the flowering period in 2021, we investigated which microbes in the soil microbiome were likely to contribute to soil VOC production. At the genus level, a Venn diagram of the microbial communities in soil and roots showed that the soil microbiome contained more genera (560 genera) than the root microbiome (160 genera) (Supplementary Fig. S10). A total of 146 genera, including Streptomyces and Bradyrhizobium, were found in both the soil and root microbiomes. Our analyses provide a snapshot of the microbial diversity in soil and roots (Fig. 6). Comparison of microbial profiles at the genus level showed that the microbial compositions of soil and roots differed (Fig. 6A). The root microbiome was dominated by the family Xanthobacteraceae, particularly the genus Bradyrhizobium. In contrast, the soil microbiome was much more diverse. Approximately 25% of the soil microbiome was Acidobacteria, a phylum of Gram-negative bacteria.

Fig. 6

Comprehensive analysis of bacterial microbiome composition and relative abundance in soil and root samples. Stacked bar plots of relative abundances of microbiome composition at the genus level (A). Genera with relative abundances of less than 2% were grouped as “others.” Volcano plot of the 146 genera common to the soil and root microbiomes (B). After p value correction, taxa that remained significant (FDR < 0.05) are shown as red dots, and taxa with FDR > 0.05 are shown as gray dots. The gray dashed line represents FDR = 0.05. The comparison was made between the soil and root microbial taxa.

Among the 146 common genera, the abundance of 53 genera differed significantly (FDR < 0.05) between the soil and root samples (Fig. 6B). Of the 53 genera, 51 were significantly more abundant in the soil microbiome, while 2 genera (Streptomyces and Bradyrhizobium) were more prevalent in the root microbiome. Among the 51 genera, the top 10 annotated soil microbes at the genus level found in KEGG GENOME (https://www.kegg.jp/kegg/genome/) were Solibacter (log2-fold change (FC) = 7.49), Gemmatimonas (log2FC = 7.42), Nordella (log2FC = 5.67), Rhodanobacter (log2FC = 5.29), Conexibacter (log2FC = 4.98), Nocardioides (log2FC = 4.91), Bacillus (log2FC = 4.37), Mycobacterium (log2FC = 4.32), Ferruginibacter (log2FC = 4.23), and Arenimonas (log2FC = 4.23). These microbes have diverse and complex metabolic pathways, implying that soil microbes could be involved in the production of soil VOCs. Bradyrhizobium also has complex metabolic pathways for producing secondary metabolites. However, its contribution to the production of the VOCs detected in this study was considered to be low, because Bradyrhizobium was the major genus in the root microbiome and its abundance was significantly higher in the root microbiome than in the soil microbiome (log2FC = -6.65). Interestingly, despite this dominance, there was no significant correlation between Bradyrhizobium (Proteobacteria) abundance and VOC production (Supplementary Fig. S7).
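To make the log2FC values and the FDR threshold easier to interpret, the sketch below converts a log2-fold change into a plain abundance ratio and applies a multiple-testing correction. It rests on two assumptions that are not stated in the text: that log2FC is computed as soil relative to root (consistent with the positive values for soil-enriched genera and the negative value for Bradyrhizobium), and that the FDR was controlled with the Benjamini–Hochberg procedure. The p values in the example are hypothetical.

```python
def fold_from_log2fc(log2fc):
    """Convert a log2-fold change into a plain abundance ratio."""
    return 2.0 ** log2fc


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# log2FC values quoted in the text (soil relative to root, by assumption).
solibacter_ratio = fold_from_log2fc(7.49)
bradyrhizobium_ratio = fold_from_log2fc(-6.65)
print(f"Solibacter: about {solibacter_ratio:.0f}-fold higher in soil")
print(f"Bradyrhizobium: about {1 / bradyrhizobium_ratio:.0f}-fold higher in roots")

# Hypothetical raw p values for five genera.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.20]
fdr = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in fdr]
print(fdr, significant)
```

On this scale, a log2FC of 7.49 corresponds to a roughly 180-fold difference in abundance, and a log2FC of -6.65 to a roughly 100-fold difference in the opposite direction. With the hypothetical p values, two of the four tests that pass an unadjusted threshold of 0.05 no longer pass after adjustment.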

Discussion

Soil volatilomics reveals varying VOCs in agricultural fields

Comprehensive soil VOC profiling, also known as soil volatilomics, was conducted to assess the potential of soil VOCs as prominent indicators for predicting soil quality in agricultural fields. GC–MS, along with direct injection Fourier-transform ion cyclotron resonance MS (FT-ICR–MS) and NMR, constitute a powerful set of techniques for analyzing soil VOCs19. Because it records full m/z information comprehensively, which makes it well suited to metabolomics analysis, GC–time-of-flight (TOF)–MS was employed in this study. The HS-SPME method enabled solvent-free sampling of VOCs24,25,26,27,28. HS-SPME–GC–TOF–MS detected ~200 soil VOCs in soybean fields, among which ~50 compounds were annotated over the 3 years (Fig. 1, Supplementary Tables S3–S5). The three-year analysis of soil VOCs revealed 1-hexanol (alcohol), which has been reported in Vicia faba plants29, sweet cherries30, and soil bacteria31. This VOC has demonstrated antibacterial and plant growth-promoting activities31,32. Three of the detected aldehydes ((Z)-2-heptenal, octanal, and nonanal) have been found in the essential oils of Eurya japonica and in vegetable oils33,34. Bean plants (Phaseolus vulgaris) can emit these aldehydes to enhance fungus resistance35. Benzaldehyde can be biosynthesized by plants and soil bacteria36,37.

Identifying the origin of soil VOCs is a challenging task because they can be emitted from plants, microorganisms, and soil pollutants. Isotope-labeling experiments combined with sophisticated analytical techniques have revealed that soil microbes are responsible for soil VOC emissions, with VOC production being higher during drought18,38. Our results indicated that soil VOC levels were markedly higher during the flowering period when soybean plants were present, compared with other stages in Fukushima (Fig. 2A–C). The average air temperature peaked at the flowering period, while the volumetric water content in soil was lower at the flowering period than at the harvesting period in 2021 (Supplementary Fig. S2). This observation suggests that the soil was relatively dry at the flowering period. These findings imply that soil microbes contribute to soil VOC production, particularly during the flowering period. In contrast, different fertilizer treatments, even no fertilizer treatment, did not affect the soil VOC abundance (Fig. 2D,E, Supplementary Fig. S3). The composition of soil microbial communities under chemical fertilizer and cow manure was similar at the flowering period in 2021 (Fig. 6A). Microbial activities, such as organic matter decomposition and the production of new substances, significantly influence soil properties. From these results, it can be inferred that the soil VOC profiles and the soil microbiome dataset were positively correlated.

Thriving soybean plants contribute to soil VOC accumulation

The increased cumulative soil VOC levels observed during the flowering period when soybean plants were present in fields (Fig. 2) suggest that soybean plants contribute to soil VOC accumulation in fields. Three possible mechanisms could account for this contribution: (1) plants directly emit VOCs into the soil atmosphere, (2) soil microorganisms produce VOCs in response to the presence of plants, and (3) microbe-induced plant VOCs are emitted from plant roots9. The 2020–2021 analysis of soil VOC profiles revealed VOC pentanoate as a potential biogenic VOC, although we could not pinpoint the specific source(s) of the increased soil VOCs, including VOC pentanoate (Fig. 3). VOC pentanoate, also known as 1-isobutyl 4-isopropyl 3-isopropyl-2,2-dimethylsuccinate, has been identified in various sources, including bryophytes39, blood of tumor-bearing mice24, pears40, Beka (a culinary herb used in Malaysia)41, and fermented herbs42. This is the first report to detect this compound as a soil VOC in agricultural fields. However, assessing overall soil quality using only a specific VOC, such as VOC pentanoate, may not be sufficient, because soil is a complex system composed of inorganic and organic matter, water, air, and a diverse community of microorganisms. Further analysis of VOC pentanoate in various soil types and locations could provide new insights into the role(s) of biogenic VOCs, including VOC pentanoate, in soil.

Soil VOC profiles exhibited a close association with soil omics data

Soil VOC profiles could serve as an indicator of soybean plant presence during the flowering period (Fig. 3). We analyzed the relationships between soil VOC profiles and six soil omics datasets during the flowering period (Fig. 4). The composition of the soil metabolome is affected by soil conditions and its treatments13,43,44. Integrated soil ionome and metabolome profiles have been shown to differentiate soil status14. In this study, soil VOC profiles exhibited strong positive correlations with both soil ionome and metabolome profiles. These findings suggest that soil VOC profiles, along with soil ionome and metabolome profiles, can serve as diagnostic indicators of soil status.

The soil VOC profiles showed a positive correlation with soil metabolome data but not with the 14 rhizosphere chemicals, including specialized metabolites produced by soybeans like soyasaponins and daidzein. The soil metabolome was primarily composed of primary metabolites, with carbohydrates and amino acids being the most abundant. Both microorganisms and soybean plants can produce these metabolites, although we could not determine the relative contributions of microorganisms and soybean plants to primary metabolite production.

Soil VOC profiles were obtained from the soil collected at 40 cm intervals around the soybean plants. In contrast, rhizosphere soil was collected by gently brushing the roots to remove the soil in this study. The rhizosphere, a zone of intense plant–microbe interactions, markedly contributes to plant growth and health45. Volatiles diffuse through soil water, which contains minerals and water-soluble metabolites, and through the air spaces of soil pores. The rhizosphere tends to contain relatively more polar VOCs than nonpolar VOCs9. The low polarity and high hydrophobicity of most soil VOCs detected in this study can be attributed to their molecular structure, which consists primarily of a hydrocarbon skeleton with one oxygen atom and lacks other heteroatoms (Table 2, Supplementary Tables S3–S5). The translocation of volatiles through soil is governed by a complex interplay of chemical properties, soil physical conditions, and biological processes. The lack of correlation between the soil VOC profiles and the root microbiome data suggests that different factors may be driving VOC production and/or mobilization in bulk soil and roots.

Soil VOC profiles showed negative correlations with the soil physics dataset. Total C and N were negatively correlated with 34 VOCs, including VOC pentanoate. In moist soil conditions, the soil can act as a sink to absorb soil VOCs (Supplementary Fig. S2)18,46. In this study, a negative correlation was observed between soil density and VOC profiles in the soil physics dataset (Supplementary Fig. S8). This suggests that as soil becomes harder, its porosity decreases, hindering VOC diffusion and potentially reducing VOC absorption. However, this does not necessarily imply that VOCs are released under dry conditions. VOC emission is influenced by a complex interplay of factors including soil moisture content, temperature, microbial activity, and soil organic matter content. Correlation analysis between soil VOC profiles and the soil physics dataset revealed positive correlations for 28 VOCs (Fig. 5, Supplementary Fig. S8B). Further investigations are necessary to elucidate the relationships between C, N, and VOCs in soil.

Soil microbes could contribute to the production of VOCs in soil

Our findings suggest that soil VOC profiles are more closely linked to the overall soil microbial community than to the root microbial community (Fig. 4). The diversity and abundance of the soil microbial community were much greater than those of the roots (Fig. 6A). The root microbial community was primarily composed of Streptomyces and Bradyrhizobium. Soybean plants have evolved a symbiotic association with Bradyrhizobium bacteria, enabling them to fix atmospheric nitrogen. Despite the importance of Bradyrhizobium in soil ecosystems, its VOC emissions have not been characterized. Bradyrhizobium comprised approximately 3% of the soil microbial community, indicating that its contribution to VOC production in the soil may be limited. In the case of Streptomyces, 120 VOCs have been reported to be produced by this genus (Fig. 6A)47. However, similar to Bradyrhizobium, the contribution of this microbe to VOC production may be low.

In the soil microbial community found in the study, Nocardioides, Bacillus, and Mycobacterium have been reported to have the ability to emit VOCs48,49,50. This suggests that the identified soil microbes could contribute to VOC production. While this study provides valuable insights into the potential contribution of specific soil microbes to VOC production, further research is needed to quantify their individual contributions and to explore the impact of environmental factors on VOC emissions. Additionally, the volatilome of other soil microbes should be investigated to gain a more comprehensive understanding of the soil volatilome.

Conclusion

Soil volatilomics reveals a strong correlation between soybean presence and soil-related omics datasets. Current agricultural practices are contributing to ongoing soil degradation, which can cause significant environmental harm and adversely affect food production in the future. Investment in sustainable soil management practices is required to reverse soil degradation. This study contributes to understanding soil status by incorporating the soil metabolome, including soil VOCs, which is crucial for ensuring future global food security.

Materials and methods

Chemicals

The n-alkane standard solutions C8–C20 for determining the retention index were purchased from SIGMA-ALDRICH (US). FUJIFILM Wako Pure Chemical Corporation (Osaka, Japan) supplied other chemicals.

Soil sampling

We used the soybean cultivar “Satonohohoemi” for the study. Four soil conditions in agricultural fields were applied: soil without soybeans and soil with soybeans under three fertilized conditions (Table 1). According to root length (Supplementary Fig. S1, Supplementary Table S1), soil samples (5–15 cm in depth) were collected “before sowing,” at “flowering periods,” and at “harvesting periods” (Supplementary Table S2). Two replicates were sampled in the plots at north and south in the Koriyama and Fukushima fields. Detailed information is summarized in Supplementary Method S1. The collected samples were stored in a biofreezer (Nihon Freezer, Japan) at − 40 °C until use.

VOC collection and profiling using the HS-SPME method with GC-TOF–MS

VOCs in soil were extracted using the HS-SPME method28 with modifications. About 3 g of soil was used. We used an SPME fiber (50/30 μm, DVB/CAR/PDMS, Stable Flex, 23 Ga, Supelco, US). Solid NaCl (0.9 g) and 3 ml of 0.1 mol/l ethylenediaminetetraacetic acid (EDTA, pH 7.5) prepared in Milli-Q water were added to each 20-ml SPME vial (Agilent Technologies, US). A 2EPA524.2 fortification solution (5 mg/l, Supelco, US) was used as an internal standard (IS). The SPME device for a CTC CombiPAL auto-sampler (CTC Analytics, Switzerland) was utilized for injection into the GC instrument. Before analysis, the fiber was conditioned at 250 °C for 30 s in the injection port of an Agilent 6890 N gas chromatograph (Agilent Technologies, US) equipped with a 30 m × 0.25 mm inner diameter fused-silica capillary column with a chemically bound 0.25-μm film Rxi-5 Sil MS stationary phase (RESTEK, US). After HS collection, the fiber was placed in the injection port of the gas chromatograph coupled to a Pegasus III TOF mass spectrometer (LECO, US). The thermodesorption of VOCs occurred for 10 min at 270 °C.

GC-TOF–MS conditions were as described in Kusano et al.51 and Supplementary Method S1. Data acquisition was performed on a Pegasus III TOF mass spectrometer (LECO, US).
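As an illustration of how the n-alkane standards (C8–C20) can be used to assign a retention index to a VOC peak, the sketch below applies the linear (temperature-programmed) retention index of van den Dool and Kratz. This formula is an assumption, since the text does not state which retention index definition was used, and the retention times in the example are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index of a peak eluting at retention time rt.

    alkane_rts maps the carbon number of each n-alkane standard to its
    retention time; times must increase with carbon number.
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the n-alkane ladder")
    # Position of the alkane eluting at or just before the analyte.
    k = min(bisect_right(times, rt) - 1, len(times) - 2)
    n, n_next = carbons[k], carbons[k + 1]
    t_n, t_next = times[k], times[k + 1]
    # Interpolate linearly between the two bracketing alkanes.
    return 100.0 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))


# Hypothetical retention times (min) for part of the n-alkane ladder.
ladder = {8: 5.0, 9: 7.0, 10: 9.0, 11: 11.0}

print(linear_retention_index(8.0, ladder))
print(linear_retention_index(11.0, ladder))
```

A peak eluting midway between nonane and decane receives an index midway between 900 and 1000, which can then be compared with library values for compounds measured on a similar stationary phase.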

Soil ionome analysis

The soil samples were freeze-dried and ground, and then subjected to ammonium acetate extraction, hydrochloric acid extraction, Bray II extraction, and water extraction. In the ammonium acetate extraction, 5 ml of 1 M ammonium acetate solution (pH 7.0) was added to 1 g of soil, mixed for 1 h, and filtered. The extracts were diluted appropriately and measured with inductively coupled plasma (ICP)–MS (NexION 350S, PerkinElmer, US) to determine exchangeable (Ex) macroelements: Ca, K, Mg, and Na. The microelements (Al, As, B, Ba, Cd, Co, Cs, Cu, Fe, Mn, Mo, Ni, Rb, Sr, and Zn) were extracted with 0.5 M HCl at a soil:solution ratio of 1:5 for 1 h, diluted, and measured with ICP–MS. The soil P concentration was determined by Bray II extraction followed by a colorimetric measurement using molybdenum blue52. The water extraction was performed to determine soil pH, nitric acid content, and electrical conductivity (EC). The supernatant of a mixture of 1 g of soil and 5 ml of ultrapure water was collected. The pH was measured with a pH meter (LAQUA pH/ION meter, HORIBA, Japan). The nitric acid concentration and EC were determined using the LAQUA twin series (HORIBA) according to the method described in the instruction manual.

Soil and root microbiome analysis

Bulk soil samples were freeze-dried, frozen, and extracted to isolate DNA. The library for sequencing was prepared through a two-step polymerase chain reaction (PCR) amplification targeting the V4 region of the bacterial 16S rRNA gene using the 515f and 806rB primers14,53. After sequencing the library, bioinformatic and statistical analyses were performed using the Quantitative Insights Into Microbial Ecology 2 software. After collecting rhizosphere soil (described later), the roots were washed with tap water, and lateral roots located about 10 cm from the above-ground part of the plant were sampled. The root samples contained both epiphytes and endophytes. For more detailed information on the experimental procedures, see Supplementary Method S2.

Soil metabolome analysis

We used 3 g of soil for the experiment. The soil was extracted with milli-Q water containing the ISs, at the equivalent of 0.25 g of soil per ml, for 1 h at 110 ℃. After centrifugation at 1000g for 10 min at room temperature, the supernatant was frozen at −80 ℃ and then lyophilized to remove the water. We conducted GC–TOF–MS profiling according to Kusano et al.28,51,54. For more details, see Supplementary Method S3.

Rhizosphere chemical analysis

Rhizosphere soil, defined as soil that adheres to the plant root after gentle shaking, was obtained using sterile brushes, and root debris was removed manually. Isoflavone and soyasaponin extraction from the rhizosphere soil was conducted using a previously described method with some modifications55. In brief, lyophilized soil samples (approximately 100 mg) were extracted in 3 × 1 ml of methanol at 50 ℃ (10 min each) and centrifuged at 5000g for 5 min. The combined supernatant was dried under a nitrogen stream at 50 ℃, re-dissolved in 100 µl methanol, and filtered through a 0.45 µm Minisart RC4 filter (Sartorius). The extracts were analyzed by liquid chromatography (LC)–MS as described previously55.

Soil physics analysis

Intact soil samples collected with either 100 cm3 or 50 cm3 stainless steel cylinders were used to determine hydraulic conductivity and water retention characteristics. Hydraulic conductivity was measured by the constant water head method. After the measurements, the samples were oven-dried to obtain the dry bulk density. Porosity was calculated from the dry bulk density and the particle density, which was measured separately using disturbed soil samples. Water retention curves were measured using the hanging water column method and the pressure plate method. Volumetric water contents at pF 1.8 (= 63 cm H2O) and pF 3.0 (= 1000 cm H2O) were used for the data analysis. The disturbed soil samples were also used to measure the mean weight diameter (MWD) of soil aggregates, which was calculated from the aggregate size distribution obtained by wet sieving. Total carbon and nitrogen contents were measured by a combustion method using an elemental analyzer (vario EL cube, Elementar, Germany). Soil hardness profiles were measured with a cone penetrometer (DIK-5532, Daiki, Japan). Based on the soil hardness profile up to the penetration depth (maximum 60 cm), the mean and maximum soil hardness were calculated.
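The derived quantities in this subsection can be sketched in Python as follows. The text does not state the formulas, so the sketch assumes the standard textbook definitions: porosity as 1 minus the ratio of dry bulk density to particle density, pF as the base-10 logarithm of the suction head in cm H2O, and MWD as the mass-weighted mean of the aggregate size-class diameters. All function names and example values are hypothetical.

```python
def porosity(dry_bulk_density, particle_density):
    """Total porosity (fraction) from densities in the same units (e.g. g/cm3)."""
    if particle_density <= 0 or dry_bulk_density < 0:
        raise ValueError("densities must be positive")
    return 1.0 - dry_bulk_density / particle_density


def pf_to_head_cm(pf):
    """Suction head in cm H2O for a given pF value (pF = log10 of head in cm)."""
    return 10.0 ** pf


def mean_weight_diameter(class_mean_diameters_mm, retained_masses_g):
    """Mass-weighted mean aggregate diameter (mm) from wet-sieving fractions."""
    if len(class_mean_diameters_mm) != len(retained_masses_g):
        raise ValueError("need one mass per size class")
    total_mass = sum(retained_masses_g)
    if total_mass <= 0:
        raise ValueError("total retained mass must be positive")
    # Each size class contributes its mean diameter weighted by its mass fraction.
    return sum(d * m / total_mass
               for d, m in zip(class_mean_diameters_mm, retained_masses_g))


# Hypothetical values, for illustration only.
print(porosity(1.0, 2.5))
print(pf_to_head_cm(3.0))         # 1000.0 cm H2O
print(round(pf_to_head_cm(1.8)))  # 63 cm H2O
print(mean_weight_diameter([3.0, 1.5, 0.5], [10.0, 20.0, 20.0]))
```

The two pF conversions reproduce the suction heads quoted in the text (63 and 1000 cm H2O).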

Data analysis

Raw data were exported in the network Common Data Form (NetCDF) file format using LECO ChromaTOF software (version 2.32) and then processed with the high-throughput data analysis method56. We obtained the normalized response of each metabolite by dividing its mass-detector response (signal intensity) by the peak area of the unique internal standard. Metabolites were tentatively identified according to the guidelines for metabolite identification57. SIMCA-P+ 14.0 software (Umetrics AB, Umeå, Sweden) and R packages in the R environment were used for the statistical analysis. For details, see Supplementary Methods S1 and S2.
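The internal-standard normalization described above can be illustrated with a minimal Python sketch. It assumes a single IS peak area per sample and a simple ratio of each metabolite response to that area; the function name, the dictionary layout, and the compound labels and values are hypothetical, and the actual processing pipeline is the one given in the cited method56.

```python
def normalize_by_internal_standard(peak_areas, is_name):
    """Divide each metabolite response by the internal standard (IS) peak area.

    peak_areas : dict mapping peak name -> detector response for one sample
    is_name    : key of the internal standard peak in that dict
    Returns a dict of normalized responses that excludes the IS itself.
    """
    if is_name not in peak_areas:
        raise KeyError(f"internal standard {is_name!r} not found in sample")
    is_area = peak_areas[is_name]
    if is_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return {name: area / is_area
            for name, area in peak_areas.items()
            if name != is_name}


# Hypothetical sample with placeholder compound labels.
sample = {"IS": 2000.0, "compound_A": 500.0, "compound_B": 1000.0}
print(normalize_by_internal_standard(sample, "IS"))
# {'compound_A': 0.25, 'compound_B': 0.5}
```

Expressing each response relative to the IS makes samples comparable despite differences in injection or detector sensitivity between runs.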