Introduction

Microbial communities are pivotal to ecosystem functions1, yet they face persistent environmental fluctuations across diverse habitats2. These changes can dramatically impact community functioning and stability3, where stability is defined as the tendency of a community to return to its original state after a disturbance4. Stability encompasses both resistance (the ability of a community property to remain unchanged or insensitive to disturbance) and resilience (the capacity to recover after disturbance and return to pre-disturbance conditions)4,5. Recovery typically follows one of four extreme scenarios: full recovery, physiological adaptation, functional redundancy, or complete loss of function with no recovery6. Despite advances, the molecular mechanisms underlying microbial community recovery remain poorly understood, particularly how individual organism strategies integrate with community-level processes to ensure resilience5,7,8.

Microbial resilience emerges from both individual organisms’ survival strategies and collective community-level coordination5,7,9. At the individual level, adaptations provide the mechanistic basis for survival during extreme fluctuations10,11,12,13. For instance, dormancy allows persistence during unfavorable conditions and rapid reactivation when conditions improve14,15, ensuring long-term population survival16. Upon rewetting, dormant microbes prioritize DNA repair and energy generation15, while others maintain consistent transcription of stress response and nutrient acquisition genes during dry periods17. Additional mechanisms, such as osmolyte accumulation and oxidative stress responses, further protect cellular integrity and enhance persistence18. At the community level, functional redundancy has been shown to buffer ecosystem processes against taxonomic shifts or environmental fluctuations6. While these individual and community-level mechanisms are both well-documented, their integration remains poorly understood.

Recent studies reveal that individual and community responses in arid systems often operate as disconnected processes, functioning at different scales and timescales. For instance, while drought caused >90% of bacterial and archaeal taxa to stop dividing, growing taxa still accounted for 4% of the total community, revealing critical disconnects between individual survival mechanisms and community-level functioning19. Similarly, despite shifts toward more drought-tolerant bacterial groups at the individual organism level, community-level enzyme activities in rhizosphere hotspots remain stable through collaborative relationships rather than compositional changes20. Furthermore, communities in arid and semi-arid ecosystems show little accumulation of osmolytes under stress, with individual energy-saving strategies like dormancy failing to translate to predictable community-level carbon cycling responses21.

These findings highlight fundamental gaps in understanding how individual organisms’ adaptations integrate with community-level processes, particularly during abrupt environmental transitions22. Advancing understanding requires integrated approaches bridging individual traits and community dynamics. The synthesis of dryland microbiome studies reveals that resilience to desertification emerges not from one scale of organization alone, but from coordinated responses across multiple levels13,23, underscoring the need for multi-omics strategies to unravel how microbial communities maintain function under stress. Understanding this multi-level response is particularly crucial as climate change intensifies environmental perturbations.

Arid environments offer a unique laboratory for studying microbial resilience. Characterized by extreme temperatures, UV radiation, and variable precipitation22,24, these ecosystems represent critical systems for understanding adaptation25,26. As these regions expand due to climate change27,28, developing integrative frameworks will be essential for predicting ecosystem responses. However, despite the urgent need for such an integrative understanding, few studies have simultaneously characterized both individual microbial adaptations and community-level processes in arid environments during environmental fluctuations.

To address this knowledge gap, we employed an integrated time-resolved multiomics approach analyzing bare soil samples from Arizona’s Saguaro National Park, an iconic desert ecosystem characterized by extreme wet-dry cycles. By tracking individual-level strategies (traits identified at the level of MAGs) alongside community composition and metabolic outputs, we reveal that microbial resilience emerges from a sophisticated balance between community stability and functional adaptability. While stochastic processes maintain stable community composition, organic matter profiles show deterministic shifts reflecting metabolic adaptations. The key mechanism is microbial network reorganization, where interaction patterns shift without major changes in community composition. This allows microbial communities to maintain their taxonomic structure while supporting coordinated individual or population-level responses, directly addressing the integration challenges identified in previous studies. This dynamic is exemplified by ammonia-oxidizing Thermoproteota, whose flexible gene expression enables different network roles while preserving critical ecosystem functions. Our findings advance understanding of microbial resilience in expanding drylands and provide a framework for predicting ecosystem responses to climate change. Our integrative approach comprehensively characterizes both microbial communities and their metabolic outputs, revealing how coordinated responses across multiple organizational levels maintain biogeochemical cycling under extreme conditions, providing insights critical for ecosystem management as arid regions continue to expand globally.

Results

Monsoon events create distinct physicochemical states in arid soils

The North American Monsoon drives shifts in resource availability across desert ecosystems. We captured these transitions with temporal sampling at the Saguaro National Park (Fig. 1A). Soil samples were collected from a depth of 10–15 cm at four sampling sites (biological replicates) across eight time points from May to September 2021, plus one in May 2022 (n = 36, 4 per time point), capturing the monsoon cycle from pre-monsoon drought through post-monsoon recovery (Fig. 1B and Supplementary Fig. 1a). While transitions followed historical monsoon patterns29, 2021 saw some of the highest precipitation in the last decade (Supplementary Fig. 1b), offering a natural experiment to examine microbial responses to extreme environmental transitions.

Fig. 1: Overview of sampling location and environmental conditions during the 2021 monsoon season.

A Geographic location of soil sampling sites within Saguaro National Park. B Pictures taken at the sampling site during the sampling campaign in May, July, and August of 2021. C Temporal dynamics of environmental and physicochemical factors throughout the 2021 monsoon season. Background indicates whether the sample was taken before, during or after the 2021 monsoon season. D Principal component analysis of the collected samples (n = 36) based on the measured environmental factors. Ellipses indicate whether the sample was taken before, during or after the 2021 monsoon season.

The shift from drought to intense rainfall created distinct temporal patterns in soil, marked by changes in moisture availability, temperature, and potential resource inputs (Fig. 1C, Supplementary Data 1 and Supplementary Fig. 1c). Principal component analysis (PCA) revealed three distinct sample clusters corresponding to pre-monsoon (yellow ellipse), monsoon (blue ellipse), and post-monsoon (red ellipse) periods (Fig. 1D). Post-monsoon samples gradually returned to pre-monsoon conditions, highlighting the cyclical nature of these transitions (Fig. 1C). Moisture content and vegetation density (measured by the Normalized Difference Vegetation Index—NDVI) emerged as primary drivers, explaining 30.47% of observed variance along the first component, while air temperature accounted for 19.66% along the second component. Hierarchical clustering analysis of physicochemical parameters further supported these distinct environmental states between monsoon and non-monsoon periods (Supplementary Fig. 1c).

Community-level responses emerge from local adaptation

To understand microbial responses to monsoon transitions, we employed a multiomics approach that combined microbial taxonomic profiling (using amplicon sequence variants (ASVs) derived from 16S rRNA amplicon sequencing and operational taxonomic units (OTUs) derived from shotgun metagenomics) with organic matter characterization using Fourier-transform ion cyclotron resonance mass spectrometry (FTICR-MS).

Our analysis indicated that microbial community composition remained relatively consistent throughout the monsoon season, despite environmental fluctuations (Fig. 2A, D; Supplementary Fig. 2a, b, h; Supplementary Note 1). Beta diversity analysis based on Bray-Curtis dissimilarity showed that time of collection did not explain significant variation in community structure (PERMANOVA month: R2 = 0.21, p = 0.947; Fig. 2D and Supplementary Fig. 2b). This compositional consistency is further supported by visually stable phylum-level profiles over time (Fig. 2A) and a high community stability index (Supplementary Fig. 2h). The observed compositional stability reflects resistance to environmental fluctuations, aligning with prior studies suggesting that arid soil microbial communities are adapted to endure dry-wet cycles30 through ecological memory31 and long-term optimization to historical moisture regimes32. However, alpha diversity metrics presented a more dynamic response (Fig. 2F). Unlike previous reports of stable or increased microbial diversity post-precipitation26,33, we observed a temporary decline in microbial diversity followed by recovery. This decline likely results from mechanisms such as differential growth triggered by increased water availability34, formation of anoxic microsites35, and osmotic stress-induced mortality36. Comparison of metagenomics read signatures with other global arid ecosystems revealed that while each ecosystem harbors locally adapted communities, the Sonoran Desert communities more closely resembled those from North American deserts (e.g., Mojave) than those from geographically distant systems (e.g., Negev) (Fig. 2C), suggesting local adaptation shapes community composition while shared regional species pools support resilience37,38.
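For reference, the Bray-Curtis dissimilarity underlying the beta diversity analysis above can be computed directly from two abundance vectors. The following is a minimal Python sketch rather than the pipeline used in this study; the function name `bray_curtis` and the OTU counts are hypothetical and serve only to illustrate the calculation.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("at least one sample must have non-zero abundance")
    # Sum of absolute differences, scaled by the total abundance of both samples.
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Hypothetical OTU counts for two samples (illustrative values, not study data).
sample_dry = [10, 0, 5, 5]
sample_wet = [5, 5, 5, 5]
print(bray_curtis(sample_dry, sample_wet))  # 0.25
```

A value of 0 indicates identical profiles, whereas a value of 1 indicates that the two samples share no taxa.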

Fig. 2: Microbial community and organic matter profiling of arid soil samples.

A Average relative abundance of the 12 most abundant microbial classes, from OTUs inferred with SingleM, at each sampling time (n = 4). B Average relative abundance of observed elemental compositions based on FTICR-MS data collected in positive mode at each sampling time (n = 4). C Non-metric multidimensional scaling (NMDS) ordination of different metagenomics datasets collected from arid soils based on MASH distances. D Principal coordinate analysis (PCoA) of shotgun metagenomics data (n = 28), based on Bray-Curtis dissimilarities calculated at the SingleM OTU level. E NMDS of FTICR-MS data (n = 28). F Alpha diversity indices (observed richness and Chao1 richness) inferred from SingleM OTUs and FTICR-MS metabolite abundances, respectively. Boxes represent the upper and lower quartiles, the line in each box represents the median and the whiskers represent the maximum and minimum values, no further than 1.5 times the interquartile range; values beyond the whiskers represent outliers and are plotted individually.
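The two richness indices named in panel F can be illustrated with a short Python sketch. It assumes the bias-corrected form of the Chao1 estimator, which may differ from the variant used to produce the figure; the function names and the count vector are hypothetical.

```python
def observed_richness(counts):
    """Number of features (e.g., OTUs) with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = observed_richness(counts)
    singletons = sum(1 for c in counts if c == 1)  # F1: features seen once
    doubletons = sum(1 for c in counts if c == 2)  # F2: features seen twice
    return s_obs + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical count vector (illustrative values, not study data).
counts = [10, 1, 1, 2, 2, 5, 0]
print(observed_richness(counts), round(chao1(counts), 3))
```

Chao1 exceeds observed richness only when singletons are present, so the gap between the two indices gives a rough indication of how many rare features may have gone undetected.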

FTICR-MS analysis detected 20,971 different masses (m/z) across all samples, with 4175 assigned putative molecular formulas. Unlike lignin-dominated peatlands39,40, these arid soils showed predominance of lipid-like compounds, followed by protein- and lignin-like metabolites (Supplementary Fig. 2d and Supplementary Note 1), which suggests that microbial communities may have a strong influence on soil organic matter composition as reported previously41,42.

Organic matter composition showed significant temporal variation (NMDS: stress = 0.11; PERMANOVA month: R2 = 0.29, p = 0.028, Fig. 2E), particularly in N- and S-containing compounds (e.g., CHON, CHONS, CHONP) (Fig. 2B). During wet periods, osmotic stress-induced cell lysis may have released intracellular metabolites36, and increased substrate accessibility may have enhanced microbial activity43. Though some lipid-like compounds remained stable throughout the monsoon season, amino sugar abundance increased with water availability. These amino sugars, likely derived from microbial components44, may have accumulated due to osmotic stress-induced mortality35. The subsequent increase in carbohydrates by the end of the monsoon season (August and September) likely reflected post-monsoon vegetation growth and enhanced microbial activity45. Organic matter alpha diversity initially increased at monsoon onset, likely due to precipitation-induced release of labile metabolites46. These metabolites were subsequently consumed, leading to decreased diversity and lability (low nominal oxidation state of carbon, NOSC) by monsoon’s end (Fig. 2F and Supplementary Fig. 2e).
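The nominal oxidation state of carbon mentioned above can be derived from an assigned molecular formula. The sketch below assumes the widely used stoichiometric definition, NOSC = 4 − (4C + H − 3N − 2O + 5P − 2S − Z)/C, where Z is the net charge; whether this exact definition was applied in this study is an assumption, and the function name `nosc` is hypothetical.

```python
def nosc(c, h=0, n=0, o=0, p=0, s=0, charge=0):
    """Nominal oxidation state of carbon from elemental counts (assumed formula)."""
    if c <= 0:
        raise ValueError("formula must contain at least one carbon atom")
    return 4 - (4 * c + h - 3 * n - 2 * o + 5 * p - 2 * s - charge) / c


# Textbook compounds used as sanity checks (not study data).
print(nosc(c=6, h=12, o=6))   # glucose, C6H12O6 -> 0.0
print(nosc(c=16, h=32, o=2))  # palmitic acid, C16H32O2 -> -1.75
```

Under this definition, lipid-like formulas such as palmitic acid yield negative values (more reduced carbon), whereas carbohydrates sit near zero.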

Genomic features supporting resilience in arid soils

To understand the genomic basis of microbial stability in arid soils, we investigated the mechanisms underlying community-level responses. Our genome-centric approach recovered 282 MAGs (Fig. 3A), including 1 high-quality draft MAG and 281 medium-quality drafts based on the minimum information about metagenome-assembled genomes (MIMAG) standards47 (Supplementary Data 2). Despite representing ~10% of the total sequenced community (per CoverM), they captured ~60% of the sequenced microbial community at the genus level (as inferred by SingleM appraise at 86% sequence identity). This provides a robust foundation for exploring genomic adaptation in arid soils, even though these MAGs represent only a subset of the community and may exclude unassembled functional potential.
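The MIMAG quality tiers referred to above can be expressed as a simple decision rule. The sketch below uses the thresholds of the MIMAG standard as they are commonly summarized (completeness and contamination in percent, plus rRNA and tRNA requirements for the high-quality tier); the function name, argument names, and example values are hypothetical and do not describe any specific MAG from this study.

```python
def mimag_quality(completeness, contamination, has_rrna_set=False, trna_count=0):
    """Assign a MIMAG draft-quality tier to a MAG (thresholds in percent)."""
    # High quality additionally requires 5S, 16S and 23S rRNA genes and >= 18 tRNAs.
    if completeness > 90 and contamination < 5 and has_rrna_set and trna_count >= 18:
        return "high-quality draft"
    if completeness >= 50 and contamination < 10:
        return "medium-quality draft"
    if completeness < 50 and contamination < 10:
        return "low-quality draft"
    return "below MIMAG draft thresholds"


# Hypothetical MAG statistics (illustrative values only).
print(mimag_quality(95.0, 2.0, has_rrna_set=True, trna_count=20))
print(mimag_quality(95.0, 2.0, has_rrna_set=False, trna_count=20))
```

Under this rule, a nearly complete genome that lacks the full rRNA gene set is still reported as a medium-quality draft, as in the second example.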

Fig. 3: Summary of genomic characteristics and stress tolerance traits of the recovered MAGs.

A Phylogenetic tree of the 282 recovered metagenome-assembled genomes (MAGs) showing their genome size, completeness, and contamination inferred by CheckM2 as well as their changes in abundance throughout the sampling campaign. B Dot plot showing the distribution of the recovered MAGs based on their GC content and estimated maximum growth rate. C Heatmap showing the presence of stress resistance traits among the different taxa of the recovered MAGs.

Phylogenomic analysis revealed unexpected taxonomic breadth, with MAGs spanning 30 bacterial classes and one archaeal class across 18 phyla (Supplementary Fig. 4). While this diversity highlights the complex microbial ecology of arid soils, it also suggests high endemism: only one MAG was classified at species level (average nucleotide identity, ANI > 95%), while most (n = 173) were classified at genus level, indicating potentially unclassified or undercharacterized taxa adapted to arid conditions (Supplementary Data 2). Analysis of the relative abundances of the recovered MAGs in both metagenomics and metatranscriptomics datasets (Supplementary Fig. 5a, b) showed that the Actinomycetota and Thermoproteota are among the most prevalent and active taxa, alongside Tectomicrobia.

Analysis of carbon fixation pathways revealed diverse metabolic capabilities distributed across the community. The Calvin–Benson–Bassham (CBB) cycle appeared in twelve bacterial genomes from Actinomycetota, Desulfobacterota, and Pseudomonadota, which encoded complete RuBisCO genes (K01601, K01602) and associated pathway components (Supplementary Fig. 6). While some steps of the reductive citrate (rTCA) cycle were widely distributed, only one MAG from Nitrospirota encoded the key enzyme. Similarly, the 3-hydroxypropionate/4-hydroxybutyrate (3HP/4HB) and dicarboxylate/4-hydroxybutyrate (DC/4HB) cycles showed restricted distribution, with key enzymes found only in Thermoproteota, Desulfobacterota, and Tectomicrobia.

Nitrogen cycling functions displayed comparable patterns of specialized and redundant distribution. Nitrification capacity was limited to Thermoproteota, specifically the ammonia-oxidizing archaea (AOA) order Nitrososphaerales. In contrast, denitrification and dissimilatory nitrate reduction were broadly distributed across taxa. Sulfate reduction pathways were mostly limited to assimilatory and dissimilatory functions (see Supplementary Note 2).

The distribution of carbon, nitrogen, and sulfur cycling pathways highlights how metagenomic potential supports environmental adaptation in arid soils. These functions exhibit contrasting patterns: some, like the Calvin–Benson–Bassham cycle and denitrification, show redundancy across multiple phyla, while others are specialized to specific taxa. This functional architecture reflects selective pressures in arid environments, where microbial communities have evolved both redundancy for essential functions and niche specialization. The community maintains genetic infrastructure for metabolic flexibility, enabling rapid pathway activation during environmental shifts. This genomic organization may be a key strategy through which microbial communities maintain critical ecosystem processes under stress5,8,48.

Genomic traits reveal convergent and specialized stress adaptations

At the level of individual MAGs, we investigated how specific traits and stress tolerance mechanisms support microbial survival under the environmental conditions of arid ecosystems. Most recovered MAGs (n = 183) exhibited high GC content and slow predicted growth rates (n = 143, maximum growth rate <0.2) (Fig. 3B), suggesting a common evolutionary response to harsh conditions49. High GC content likely enhances thermotolerance50, while slow growth rates indicate oligotrophic lifestyles51, aligning with previous findings of GC enrichment in arid conditions30 and resource-driven selection against AT base pairs in nutrient-limited soils52. Notably, Thermoproteota demonstrated an alternative strategy with low GC content, which may reflect genome streamlining53 and efficient nitrogen use54. However, while genome streamlining is typically associated with high coding density, our Nitrososphaera MAGs exhibited coding densities ranging from 67.6 to 87.2% (Supplementary Data 2), which extend below the range reported for other Nitrososphaera MAGs (73–85%)55. The lower values in our dataset may reflect incomplete or fragmented genome assemblies, as our MAGs fall within medium-quality thresholds. Alternatively, these patterns could indicate that desert-adapted Nitrososphaera retain a higher proportion of non-coding DNA, potentially related to regulatory flexibility or increased horizontal gene transfer capacity56, which may enhance survival under fluctuating environmental stressors.

Gene expression analysis of stress tolerance traits (Supplementary Data 3) revealed that protein stability mechanisms were widespread, with most taxa encoding and expressing genes for protection, repair, and degradation of denatured and misfolded proteins (Figs. 3C and 4), reflecting adaptation to high temperatures18. These included various heat shock proteins (HSP70, HSP40, HSP60, HSP10, HSP100, HSP24) critical for protein folding and protection57.

Fig. 4: Expression of stress tolerance genes during the 2021 monsoon season.

Heatmap showing the levels of gene expression, agglomerated at phylum level, at different sampling times for various stress tolerance mechanisms defined in Supplementary Data 7. Gene expression was normalized as gene length corrected trimmed mean of M-values (geTMM) and transformed into a z-score for each phylum and each stress tolerance mechanism to better represent the changes across the monsoon season.
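The z-score transformation described in this legend can be sketched as follows. The example assumes that geTMM values have already been computed and that scaling uses the sample standard deviation; both the function name `zscore` and the expression values are hypothetical.

```python
from statistics import mean, stdev


def zscore(values):
    """Center and scale one expression profile (one phylum, one mechanism)."""
    if len(values) < 2:
        raise ValueError("need at least two time points")
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1), an assumption here
    if sd == 0:
        # A flat profile carries no temporal signal; report zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical geTMM values across sampling times (illustrative values only).
getmm_by_time = {"May": 80.0, "July_1": 4.0, "July_2": 6.0, "August": 10.0}
scaled = dict(zip(getmm_by_time, zscore(list(getmm_by_time.values()))))
print(max(scaled, key=scaled.get))  # time point with the highest relative expression
```

Because each profile is scaled independently, the heatmap emphasizes when a mechanism peaks within a phylum rather than which phylum expresses it most strongly in absolute terms.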

Osmolyte production and transport showed both common and specialized patterns (Figs. 3C and 4). While trehalose and glutamate production were widespread, glycine betaine biosynthesis was restricted to select groups, and bacteria-exclusive ectoine was found only in Actinomycetota. These osmolytes help manage both high salinity and sudden water influx57.

Oxidative stress responses revealed similar diversity in adaptation strategies. Superoxide dismutase (K04564) was expressed by most MAGs (n = 140) during dry months, with particularly high expression in Entotheonella (geTMM_May = 478.8; geTMM_July_1 = 263.47), Rubrobacteria (geTMM_May = 84.73; geTMM_July_1 = 1.40) and Thermoleophilia (geTMM_May = 72.52; geTMM_July_1 = 1.86) members (Fig. 4). Different taxa employed specialized catalase systems: heme-based catalases (KatG, KatE) in Acidobacteriota, Actinomycetota, Chloroflexota, Myxococcota and Pseudomonadota (KatG n = 41; KatE n = 8), and manganese catalase (K07217) in Thermoproteota (e.g., Nitrososphaera geTMM_May = 10.63; geTMM_July_1 = 0.40), Gemmatimonadota and Nitrospirota (n = 41), with the highest expression during dry months (Fig. 4), suggesting specialized strategies for reactive oxygen species management58.

Multifunctional dps proteins (K04047) were present in several phyla (Fig. 3C), providing resistance against oxidative, metal, and thermal stress59, with the highest expression occurring during the dry months for Acidobacteriota, Actinomycetota, Chloroflexota, Gemmatimonadota, Tectomicrobia and Thermoproteota (e.g., Nitrososphaera geTMM_May = 81; geTMM_July_1 = 3.32), and in the wet months for Pseudomonadota (Fig. 4). Furthermore, the genes for the glyoxylate bypass, suggested to play a role in oxidative stress60, were found encoded in many taxa and expressed during the dry months in Actinomycetota, Chloroflexota (e.g., UBA6077: geTMM_May = 3.46; geTMM_July_1 = 1.74) and Tectomicrobia (e.g., Entotheonella: geTMM_May = 13.88; geTMM_July_1 = 8.47), and during the wet months in Acidobacteriota (e.g., Terriglobia: geTMM_May = 0.007; geTMM_July_1 = 0.38), Desulfobacterota, Gemmatimonadota (e.g., Gemmatimonadota: geTMM_May = 0.15; geTMM_July_1 = 2.78) and Pseudomonadota genomes (n = 48) (Fig. 4).

Other widely distributed stress tolerance mechanisms included the expression of genes related to DNA repair: most taxa, except for Thermoproteota, encoded and expressed genes for the RecA (K03553) protein, which catalyzes one of the central steps of the DNA repair and homologous recombination pathway61, as well as polymerases such as polA (K02335) and polIV (K02346, K04479) that may play a role in double-strand DNA break repair62,63. Other strategies related to dormancy and persistence, such as sporulation, exopolysaccharide and lipopolysaccharide production, and polymeric carbon storage, are explored in Supplementary Note 3.

Contrasting assembly processes reveal mechanisms of community-level resilience

While our observations demonstrate clear community-level responses to environmental change, the underlying ecological processes maintaining stability remain unclear. To investigate these mechanisms, we analyzed the ecological processes driving both community assembly and metabolite dynamics64,65,66,67. Our analysis revealed contrasting yet complementary mechanisms: stochastic processes predominantly shaped microbiome assemblages, while deterministic processes governed metabolite assemblages (Fig. 5A–C).

Fig. 5: Ecological assembly processes contributing to microbial communities and metabolites in arid soils influenced by monsoon.

A, B Density plots showing changes in the βNTI of microbial community and metabolome in each month of sampling (n = 32) as well as bar plots indicating the percentage breakdowns of stochastic versus deterministic processes; βNTI was calculated from 16S rRNA ASVs and TWCD metabolites. Assembly processes can be delimited by the red dashed lines: variable selection (βNTI > 2), homogeneous selection (βNTI < −2), stochastic assembly (|βNTI| < 2). C Bar plots showing the putative influence of different ecological processes within each month according to the microbial community, bulk metabolites, lipid-, sugar-, and TLC-like metabolites. D Heatmap representing the expression of microbial motility-related genes. Gene length corrected trimmed mean of M-values (geTMM) values for expressed motility-related genes were averaged for each MAG and month; the log10 is represented.

Microbial assembly processes transitioned over time. During the early monsoon (early July), we observed an increase in β-nearest taxon index (βNTI) values > +2, indicating a temporary shift toward deterministic assembly. This coincided with the first major rainfall events (Fig. 1C), suggesting that environmental filtering may have favored rain-responsive taxa. As the monsoon progressed (late July to early August), βNTI values decreased and fell within the stochastic range (|βNTI| < 2), indicating a return to stochastic dynamics that maintained compositional stability.
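The βNTI thresholds used to separate deterministic from stochastic assembly can be written as a small classification rule. The Python sketch below is illustrative only: the function names and βNTI values are hypothetical, and values falling exactly on ±2 are assigned to the stochastic class as an assumption.

```python
from collections import Counter


def classify_bnti(value, threshold=2.0):
    """Map one pairwise βNTI value to an assembly process."""
    if value > threshold:
        return "variable selection"
    if value < -threshold:
        return "homogeneous selection"
    return "stochastic"


def process_fractions(values, threshold=2.0):
    """Fraction of pairwise comparisons assigned to each process."""
    if not values:
        raise ValueError("no βNTI values supplied")
    counts = Counter(classify_bnti(v, threshold) for v in values)
    return {process: count / len(values) for process, count in counts.items()}


# Hypothetical pairwise βNTI values for one month (illustrative values only).
example = [2.8, 0.4, -1.1, -2.6, 1.9, 0.0, 3.1, -0.3]
print(process_fractions(example))
```

Summarizing the fractions per month in this way gives the kind of stochastic versus deterministic breakdown shown as bar plots in Fig. 5.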

This transition also aligned with transcriptional changes. Specifically, we detected increased expression of genes associated with microbial motility, such as chemotaxis, flagellar biosynthesis, gliding motility, and pilus systems (Fig. 5D). For example, Myxococcota expressed both gliding and flagellar genes, while Actinobacteria, Chloroflexota, and Desulfobacterota upregulated other mobility pathways. Pseudomonadota showed elevated expression of pilus assembly genes, consistent with twitching motility. These transcriptional patterns suggest that microbial mobility may have supported dispersal and community homogenization, helping to maintain stability under fluctuating moisture conditions.

These shifts in assembly processes may reflect underlying ecological mechanisms that have been proposed in prior studies. For example, the sudden influx of water and nutrients during early monsoon events likely created selection pressures favoring wet-adapted taxa and benefiting tolerant or opportunistic microbial groups9,68. This environmental filtering may have intensified competition between rain-activated and dry-tolerant microbes35, contributing to the transient increase in deterministic assembly that we observed. As rainfall events became more frequent (late July–early August; Fig. 1C), stochastic processes emerged as the dominant force shaping community structure (Fig. 5C). This shift toward stochasticity could be driven by two mechanisms. Increased soil moisture during wetter periods may enhance pore connectivity and facilitate microbial dispersal6, while microbial dormancy during dry periods can buffer against environmental filtering by preserving a reservoir of resilient taxa14,15. These processes may help explain the compositional stability observed during the mid- to late-monsoon period.

In contrast, metabolite assemblages were consistently structured by deterministic processes throughout the monsoon cycle, similar to observations in peatland ecosystems67. βNTI values for metabolites remained outside the stochastic range (|βNTI| > 2), indicating the dominant role of environmental selection and chemical transformations in shaping metabolome composition (Fig. 5A–C).

We observed higher metabolome variability during wet periods, possibly reflecting increased inputs from plant exudates and litter69, as well as microbial byproducts70,71,72 (e.g., osmolytes, secondary metabolites, necromass). Likewise, drier periods were associated with higher relative abundance of sugar- and lipid-like compounds, which may reflect drought-adaptive metabolic strategies. These interpretations align with literature showing that sugar metabolites can function as osmolytes73, and lipid changes often indicate membrane remodeling under water stress74. Abiotic factors, including photodegradation and wind erosion, possibly further shaped organic matter dynamics and metabolome heterogeneity75.

Network restructuring reveals how individual traits shape community responses

To bridge individual adaptations and community-level properties, we analyzed both microbial co-occurrence networks and metabolic interaction networks. Using a random matrix theory (RMT)-based approach to build the co-occurrence networks76, we found that increased precipitation altered network architecture77,78. During dry periods, networks were larger and more modular (higher modularity, transitivity, and node/edge numbers, lower average path distance), whereas wet conditions produced smaller, more hierarchical networks (higher degree centralization, betweenness centralization, and stress centrality) (Fig. 6A and Supplementary Data 4). The shift in network architecture between dry and wet co-occurrence networks represents more than structural reorganization; it reveals fundamental changes in community interaction strategies.
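To make the notion of modularity concrete, the sketch below computes Newman's modularity Q for an undirected, unweighted network with a given module assignment. It is not the RMT-based workflow used here, and it assumes that each edge is listed once; the function name, toy network, and module labels are hypothetical.

```python
def modularity(edges, membership):
    """Newman modularity: sum over modules of e_c / m - (d_c / (2 * m)) ** 2."""
    m = len(edges)
    if m == 0:
        raise ValueError("network has no edges")
    internal = {}    # e_c: number of edges with both ends in module c
    degree_sum = {}  # d_c: summed degree of the nodes in module c
    for a, b in edges:
        ca, cb = membership[a], membership[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Toy network: two triangles joined by a single bridging edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
modules = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
print(round(modularity(edges, modules), 4))
```

Higher Q indicates that connections are concentrated within modules rather than between them, which is the sense in which the dry-period networks are described as more modular.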

Fig. 6: Microbial network analysis of the recovered MAGs.

A Co-occurrence networks based on MAG abundances (TMM) inferred with CoverM for the dry (May and October) and wet (July and August) months. Nodes represent MAGs and are colored based on their phylum. Node size indicates the degree (number of connections) of each node. Colored ellipses indicate network modules with more than six nodes. B Social networks based on the metabolic interaction potential (MIP) score calculated with SMETANA. Only genome pairs with MIP ≥ 5 were considered for the network. Nodes represent MAGs and are colored based on their phylum. Node size indicates the degree (number of connections) of each node. C Heatmap showing metabolite donation and uptake patterns between network hub microbes and members of other phyla.
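The network construction rule stated for panel B (retain genome pairs with MIP ≥ 5, then size nodes by degree) can be sketched as follows. The input format, function names, MAG labels, and scores are hypothetical and do not reproduce SMETANA output.

```python
from collections import defaultdict


def build_network(mip_scores, min_mip=5):
    """Undirected adjacency map keeping only pairs with MIP >= min_mip."""
    adjacency = defaultdict(set)
    for (mag_a, mag_b), score in mip_scores.items():
        if mag_a != mag_b and score >= min_mip:
            adjacency[mag_a].add(mag_b)
            adjacency[mag_b].add(mag_a)
    return adjacency


def node_degrees(adjacency):
    """Degree (number of connections) of each node in the network."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


# Hypothetical pairwise MIP scores (illustrative values only).
mip_scores = {
    ("mag_a", "mag_b"): 7,
    ("mag_a", "mag_c"): 5,
    ("mag_b", "mag_c"): 3,  # below the threshold, so no edge is drawn
    ("mag_c", "mag_d"): 9,
}
degrees = node_degrees(build_network(mip_scores))
print(sorted(degrees.items()))
```

Nodes with the highest degree in such a network are candidates for the hub roles discussed in the text, although hub assignment in the study may rely on additional criteria.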

During dry periods, the larger, more modular co-occurrence networks suggest a community-wide strategy for resource conservation. To test whether modules represent metabolically interdependent units, we analyzed metabolite sharing patterns within network modules using our SMETANA-derived metabolic interaction data based on the metabolic interaction potential (MIP) score79 (Supplementary Fig. 7a). Our analysis revealed three key findings: (1) Metabolite sharing is not enriched within modules compared to between modules when considering overall interaction frequency (Supplementary Fig. 7b). (2) Modules show distinct metabolic specialization patterns, with amino acid sharing concentrated in specific modules during dry conditions, while aromatic compounds and carboxylic acids showed a more even distribution across modules (Supplementary Fig. 7a). (3) More metabolites seem to be shared within modules during dry conditions (Supplementary Fig. 7b), with the number of MAG pairs sharing metabolically important metabolites such as amino acids, carboxylic acids, fatty acids, lipids, and nitrogen cycle-related metabolites decreasing by around 38% during wet conditions (Supplementary Fig. 7c). These patterns suggest that network modularity reflects functional specialization of metabolic processes rather than general metabolic complementarity. However, we acknowledge that habitat heterogeneity may also contribute to the observed modularity patterns. This organization stands in stark contrast to conventional understanding of drought responses, where networks typically become fragmented and less connected80,81.

Our findings overturn traditional expectations about water availability’s impact on microbial networks. Unlike other ecosystems where wet conditions promote higher interconnectedness82, dry conditions in our system fostered tighter community organization through modular, interconnected network structures. This tight organization likely reflects the need for efficient resource utilization83, where interdependence enhances environmental resilience. Conversely, wet conditions relaxed these tight resource constraints, allowing for more hierarchical organization with diffuse, opportunistic interactions82,84. These changes in structure suggest microbial interactions and responses play a role in community-level resilience. The higher centrality of wet co-occurrence networks suggests wet environments favor dominant species that quickly capitalize on increased nutrients85, while osmotic stress forces other microbes to regulate metabolism more intensively31. This network restructuring aligns with increased selection processes at monsoon onset (Fig. 5A, B), as water simultaneously drives community structure and metabolic activity changes86,87, while enhancing niche connectivity through dispersal mechanisms6.

Network analysis identified key species bridging individual traits and community function. In dry conditions, an Actinomycetota MAG (DNA-4_bin_15_1) served as a network connector, similar to findings from other deserts, including the Namib88, Atacama89, and Sahara90. However, this MAG’s high abundance challenges the assumption that desert community stability relies on rare taxa91, suggesting abundant species can be central to ecosystem function.

Analysis of metabolic interaction potential using SMETANA79 (Fig. 6C) further revealed that dry networks showed more extensive metabolic interactions, indicating increased interdependence and active metabolite exchange83. Moreover, this analysis identified additional potential keystone MAGs (network hubs): Methylomirabilota (bco_bin_52) and Actinomycetota (DNA-4_bin_15_1) in dry conditions, and Acidobacteriota (bco_bin_18_1) during wet periods (Fig. 6B). The Methylomirabilota keystone has roles in greenhouse gas cycling92, soil processes93, and vegetation interactions94. SMETANA analysis in detailed mode (Fig. 6C) revealed that this MAG primarily donates amino acids, as previously reported95, while the Actinomycetota hub absorbs them, demonstrating how individual metabolic capabilities shape community-level interactions and eventually resilience through energy-expensive amino acid exchange96.

Analysis of metatranscriptomics data confirmed some of these results, as complete biosynthetic pathways for valine, leucine and isoleucine were found to be present in several MAGs, including Methylomirabilota (Supplementary Fig. 8a), and were expressed throughout the monsoon season (Supplementary Fig. 8b). Additionally, expression of amino acid transporters was detected in a wide range of phyla (Supplementary Fig. 8c), indicating the ability of these microbes to retrieve amino acids from the environment. These insights into network restructuring show that shifts in microbial interactions and metabolic activity are closely linked to the community’s ability to resist and recover from disturbance. Changes in individual responses, such as the activation of stress mechanisms or the activation and deactivation of biosynthetic pathways, respond to community-wide strategies that promote either collaboration or competition among community members. This coordination across different ecological organization levels may be key to sustaining ecosystem function under changing precipitation patterns.

Strategic gene regulation by Thermoproteota maintains critical ecosystem processes

To understand how individual taxa maintain specific functions within changing networks, we examined Thermoproteota as a model organism. We selected this phylum because: (1) eight MAGs from the Nitrososphaeraceae family represented the only aerobic ammonia oxidizers in our community, (2) nitrogen availability critically influences arid ecosystem function42,97, and (3) Thermoproteota’s genomic features and dynamic gene expression responses provide a unique model for understanding how individual adaptations support community stability.

Using genome-resolved metatranscriptomics, we tracked Thermoproteota’s gene regulation throughout the monsoon season (Supplementary Data 5). The ammonia oxidation genes amoABC (K10944, K10945, K10946) showed unexpected expression patterns, peaking during dry periods (Fig. 7A), with amoC (K10946) showing the highest expression, as previously described in ref. 55. This counterintuitive pattern reveals a sophisticated adaptation strategy. During dry periods, when most microbes reduce activity, Thermoproteota maintains elevated transcription of nitrification genes, as previously observed in hyperarid environments98. This activity likely reflects an ability to capitalize on reduced competition for limited nitrogen resources. The strategy is further supported by stress tolerance mechanisms, including specialized protein repair systems, manganese-based catalases for ROS management, and efficient nitrogen utilization. Together, these adaptations suggest that Thermoproteota may contribute to sustaining key ecosystem functions even when other taxa become dormant.

Fig. 7: Nitrogen metabolism in Thermoproteota MAGs and correlation with environmental factors.

A Heatmap representing the expression of Nitrogen metabolism genes in Thermoproteota MAGs. Reaction-level expression, expressed as gene length corrected trimmed mean of M-values (geTMM) value, indicates that at least 50% of the genes of any of its alternative gene sets were expressed (see “Methods”), and is presented here averaged per MAG and per month. Expression was log10-transformed to improve visualization. B Dotplot showing correlations between amoABC gene complex expression and environmental variables (soil moisture and NDVI). C Heatmap of urease complex and transport gene expression (geTMM).

Environmental correlations further highlight how individual adaptations respond to community-level conditions. A strong negative correlation between amoABC expression and both soil moisture and NDVI (Fig. 7B and Supplementary Data 10) suggests that Thermoproteota sustains high nitrification activity under the driest conditions, potentially outcompeting other microbes for limited nitrogen. Moreover, four Thermoproteota MAGs showed high expression of the urease gene cluster (ureABCDEFG) during wet periods (Fig. 7C), indicating these taxa can metabolize both ammonia and urea as energy sources99. This versatility exemplifies how individual-level adaptations enable persistence across variable conditions, providing ecological advantages in nutrient-limited soils. A detailed overview of gene expression changes related to carbon, nitrogen, sulfur, and other metabolic pathways in Thermoproteota MAGs is provided in Supplementary Note 4.

The lack of reconstructed ammonia-oxidizing bacteria MAGs in our system further highlights how Thermoproteota’s strategy of regulated persistence enables their dominance of nitrification processes. Analysis of the taxonomic profiles of other arid soil datasets showcases the widespread presence of Thermoproteota in arid systems worldwide (Supplementary Fig. 2g). This global distribution underscores their importance in stabilizing nitrogen fluxes, a critical ecosystem function in arid environments that are particularly vulnerable to climate change-induced disruptions42. It also aligns with previous reports of archaeal dominance in arid soils42 and their broader distribution across oligotrophic environments100,101.

The potential ecological impact of Thermoproteota’s expression patterns extends beyond individual survival to influence community-level processes91. First, elevated transcription of ammonia oxidation genes during dry periods suggests that Thermoproteota may help sustain nitrogen cycling when other taxa reduce activity. Second, its network position shifts from peripheral during wet periods to central during dry periods, demonstrating how individual adaptations influence community organization. Third, its consistent expression of stress tolerance and nitrogen metabolism genes across conditions highlights how specialized taxa may contribute to functional resilience under environmental stress.

Discussion

In this study, we used an integrated time-resolved multiomics approach to explore how microbial communities respond to dramatic environmental shifts. Our results reveal how individual microbial stress responses and community-level adaptations interact to support ecosystem function under fluctuating conditions. Using Sonoran Desert soils as a model, we identified three mechanisms driving microbial resilience: (1) stochastic processes supporting stability, (2) dynamic network reorganization, and (3) coordinated stress responses. We also highlight Thermoproteota as a key taxon in arid soils, with potential as a model for studying adaptation to aridity.

First, regarding microbial network responses, our findings challenge the prevailing view that drought predominantly disrupts microbial networks, as often observed in analysis of microbial co-occurrence networks78,81,102. Instead, we observed larger and more interconnected microbial networks under dry conditions, aligning with evidence that stress can foster tighter microbial associations103. This tightly organized community structure became more diffuse during wet conditions, suggesting dynamic reorganization based on resource availability (Fig. 6A and Supplementary Data 4). These results may reflect unique adaptations of microbial communities in bare soils, where plant buffering effects are limited104,105.

We acknowledge that factors beyond precipitation, including soil microenvironmental heterogeneity, pH gradients, and nutrient patchiness, may also contribute to observed network patterns. Our single-site study design limits our ability to disentangle precipitation effects from other environmental variables, and future multi-site studies would help clarify these factors. However, our temporal sampling within the same location helps control for confounding variables, as major soil properties remain constant while precipitation patterns vary seasonally. The observed increase in network connectivity under drought supports the stress gradient hypothesis106,107,108. During dry periods, certain taxa, such as Thermoproteota and Actinomycetota, were identified as network hubs because they were highly connected to other nodes within their own module (Zi > 2.5) as well as to several other modules (Pi > 0.62); these hubs facilitate functional complementarity and efficient resource sharing among community members. Conversely, wet conditions promoted more centralized networks with increased negative correlations, indicating the rise of dominant microbes and heightened competition84,108. This dynamic network restructuring supports microbial diversity and ecosystem function, as evidenced by stable community composition across monsoon cycles and dominance of stochastic assembly processes (Fig. 5C). These patterns suggest that microbial communities may balance stability and adaptability through interaction strategies under fluctuating environments.
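The hub criteria above rely on the within-module degree (Zi) and the participation coefficient (Pi). As a minimal sketch, assuming an undirected, unweighted network given as an edge list plus a node-to-module mapping, the two metrics could be computed as follows. The function names, the use of the population standard deviation, and the handling of zero-variance modules are illustrative assumptions, not the pipeline used in this study.

```python
from collections import defaultdict
from statistics import mean, pstdev


def node_roles(edges, module_of):
    """Return {node: (Zi, Pi)} for an undirected, unweighted network.

    edges: iterable of (node_a, node_b) pairs.
    module_of: dict mapping every node to its module label (assumed complete).
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Number of links each node has inside its own module.
    within = {
        n: sum(1 for m in neighbours[n] if module_of[m] == module_of[n])
        for n in module_of
    }

    # Within-module degrees grouped by module, for the z-score.
    by_module = defaultdict(list)
    for n, k in within.items():
        by_module[module_of[n]].append(k)

    roles = {}
    for n in module_of:
        ks = by_module[module_of[n]]
        sd = pstdev(ks)  # assumption: population SD; 0 means no variation
        zi = (within[n] - mean(ks)) / sd if sd > 0 else 0.0

        # Participation coefficient: 1 - sum over modules of (k_is / k_i)^2.
        degree = len(neighbours[n])
        links_per_module = defaultdict(int)
        for m in neighbours[n]:
            links_per_module[module_of[m]] += 1
        pi = (
            1.0 - sum((k / degree) ** 2 for k in links_per_module.values())
            if degree
            else 0.0
        )
        roles[n] = (zi, pi)
    return roles


def is_network_hub(zi, pi, z_cut=2.5, p_cut=0.62):
    """Apply the thresholds quoted in the text (Zi > 2.5 and Pi > 0.62)."""
    return zi > z_cut and pi > p_cut
```

A node with all of its links inside one module has Pi = 0, whereas a node whose links are spread evenly across many modules approaches Pi = 1.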

We propose that microbial network reorganization supports community resilience by coordinating responses at multiple biological levels. Tight, modular networks during dry periods enhance resource exchange and foster interdependencies that promote stability under stress79,80. This is consistent with the expression of amino-acid biosynthesis and transport systems across the monsoon season for cross-feeding. In contrast, diffuse and centralized networks during wet periods reflect increased metabolic flexibility and opportunistic interactions, enabling communities to rapidly exploit resource pulses while maintaining core functions13.

Despite limitations in metagenomics-derived network analyses, such as the inability to distinguish direct interactions from correlations that arise from indirect interactions or shared niches109, and the fact that only a subset of the community may be represented, our approach demonstrates the utility of network analysis for inferring ecological functions80,81,110. While assembled MAGs represented only 10% of sequenced reads, as high strain heterogeneity may have prevented us from recovering a larger number of MAGs47,111,112, our analysis showed that assembled MAGs captured 86% of the community’s taxonomic diversity. This taxonomic coverage captures key functional and ecological traits. For example, the prominent role of Thermoproteota during dry conditions (Fig. 7A) highlights its contribution to nitrogen cycling, a critical process for maintaining ecosystem function. Its widespread presence in arid environments globally (Supplementary Fig. 2g) indicates its ecological relevance for resilience mechanisms and highlights its potential as a biotechnological target. On the other hand, network hubs support the community by facilitating the exchange of key metabolites such as amino acids or sugars (Fig. 6C). To validate their functional significance, we examined expression patterns of key amino acid biosynthesis and transporter genes in the hub MAGs. Hubs from the dry season network consistently expressed these biosynthetic pathways and transporters (Supplementary Fig. 8), with a notable increase following the start of the monsoon: Methylomirabilota (geTMM: May = 2.28, July_1 = 18.66) and Actinomycetota (geTMM: May = 1.22, July_1 = 8.944). In contrast, the wet season network hub showed a delayed increase in expression, peaking in August when the other hubs’ expression was starting to decline: Acidobacteriota (geTMM: May = 0.14, August = 19.24).
These coordinated interactions generate reciprocal dependencies that optimize resource utilization, creating dynamic feedback mechanisms where individual metabolic capabilities influence community assembly patterns while established network associations shape individual metabolic investments and gene expression priorities. This bidirectional coupling is supported by widespread expression of amino acid biosynthetic pathways and transport systems across multiple taxa within network modules (Supplementary Fig. 8), demonstrating how individual metabolic capabilities play a role in shaping community structure and, in turn, are shaped by the broader microbial community79.

Complementing these network dynamics, the analysis of organic matter profiles reveals deterministic changes driven by both community-level metabolic adjustments and individual stress responses. Unlike plant-dominated ecosystems39,40, the prevalence of lipid- and protein-like compounds in these soils underscores the strong microbial influence on soil organic matter composition41. During dry periods, the accumulation of osmolytes and stress-related compounds reflects individual microbial adaptations113, while shifts in metabolite profiles during wet periods indicate community-level metabolic reorganization. These changes mirror network restructuring, as metabolite exchange relationships transition between tight and diffuse network arrangements, further illustrating the integration of microbial activities in maintaining ecosystem resilience.

At the individual organism level, our analysis revealed shared and specialized survival strategies among taxa. Actinomycetota employed multiple adaptations including high GC content, chaperone proteins114, spore formation115, osmolyte production116, and versatile metabolic strategies117. Similarly, Thermoproteota maintained function through stress response mechanisms including chaperones, DNA-repair proteins, and stress-related proteins like katG, katE, dps, and superoxide dismutases (Fig. 7A and Supplementary Fig. 6). The consistency of stress response mechanisms across diverse taxa suggests strong selective pressure for maintaining both individual survival and community function118. The widespread distribution of heat shock proteins, osmolyte production pathways, and oxidative stress responses indicates that these mechanisms represent essential adaptations to arid conditions14,102. These diverse survival strategies underscore the critical role of functional redundancy and metabolic versatility in maintaining community stability, demonstrating how alternative strategies can achieve similar community-level outcomes14,15. These individual MAG-level adaptations form the building blocks of network-mediated resilience, enabling communities to buffer environmental perturbations and sustain key ecosystem processes.

Together, these findings across multiple organizational levels provide a framework for understanding microbial resilience in arid environments. Stochastic processes support stable community structures, while dynamic network reorganization and MAG-level stress responses facilitate adaptation to extreme conditions (Fig. 8). The central role of Thermoproteota in nitrogen cycling suggests its potential as an ecological indicator of resilience. These strategies point to common mechanisms of community adaptation and resilience that are potentially applicable across arid ecosystems and may extend to non-arid environments facing environmental stress.

Fig. 8: Proposed dual-level resilience framework showing microbial responses to monsoon fluctuations in arid ecosystems.

Resilience emerges from interactions between individual-level mechanisms (genomic traits, metabolic functions, gene regulation) and community-level processes (composition, assembly). These interactions drive network restructuring, enabling resource sharing and protection during dry periods, and resource exploitation during wet periods. Individual-level adaptations include high GC content, DNA repair mechanisms, and osmolyte production, while community-level responses involve both co-occurrence and metabolic interaction networks.

While these multi-level resilience mechanisms currently enable functional persistence through monsoon cycles, they also suggest vulnerabilities to future climate change. Increased aridity, prolonged droughts, or disruptions to stochastic processes may exceed the adaptive capacity of these microbial communities, leading to cascading effects and destabilization31,82,119,120. To safeguard critical ecosystem functions, future efforts should prioritize maintaining functional redundancy within microbial communities, conserving microbial diversity hotspots, and monitoring network reorganization thresholds. Additionally, integrating predictive modeling with multiomics datasets could facilitate early detection of resilience breakdowns under changing climate scenarios. Tracking key taxa such as Thermoproteota will be essential, providing critical data on how microbial communities navigate environmental extremes and informing strategies to safeguard ecosystem functionality.

Methods

Soil collection

Soil samples were collected at the Saguaro National Park West (Permit #: SAGU-2021-SCI-0006), Tucson, Arizona (32°15′05.4″N, 111°09′40.2″W) from a depth of 10–15 cm using ethanol-sterilized shovels. Sample collection occurred across 9 time points spanning pre-monsoon (May 2021, 2022), monsoon (June–August 2021), and post-monsoon (September–October 2021) seasons, with 4 biological replicates (i.e., 4 sampling sites per month) per time point (total n = 36). Samples were transported on ice to the laboratory, and sequentially sieved through 4000 and 2000 µm Fieldmaster® sieves (Science First, Yulee, FL). Samples were stored at −80 °C until analysis.

The sampling site is located within the Sonoran Desert in the Southwestern USA, a region that experiences extreme temperatures exceeding 40 °C, with maximum temperatures reaching 48 °C121. It has an annual precipitation average of 200 mm (1980–2016), with regional variations across the desert landscape122. The area’s distinctive bimodal precipitation pattern includes winter frontal system rainfall and summer monsoon storms123, with a characteristic dry period in late spring.

Soil physicochemical properties

For soil moisture, around 5 g of soil were placed in 15 ml Eppendorf tubes (3 replicates per sample) and lyophilized on a FreeZone 2.5 l Benchtop Freeze Dryer (Labconco, Kansas City, MO) for 72 h. Moisture content (%) was calculated as: [(initial weight − lyophilized weight)/initial weight] × 100. For pH measurements, 2 g of soil were mixed with 6 ml double-distilled water, vortexed for 15 min, centrifuged for 3 min, and incubated at room temperature for 30 min. pH was measured twice per sample using a benchtop Orion Star A211 pH meter (Thermo Scientific, Waltham, MA).
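The gravimetric moisture formula above can be written as a short helper. This is a minimal sketch: the function names and the averaging over the three technical replicates are illustrative assumptions rather than the authors' script.

```python
from statistics import mean


def moisture_content_percent(initial_weight_g, lyophilized_weight_g):
    """Moisture (%) = (initial - lyophilized) / initial * 100."""
    if initial_weight_g <= 0:
        raise ValueError("initial weight must be positive")
    return (initial_weight_g - lyophilized_weight_g) / initial_weight_g * 100.0


def sample_moisture(replicates):
    """Average moisture over replicate (initial, lyophilized) weight pairs.

    Averaging the replicates is an assumption; the text only states that
    three replicates were weighed per sample.
    """
    return mean(moisture_content_percent(w0, w1) for w0, w1 in replicates)
```

For example, a 5.0 g aliquot weighing 4.5 g after lyophilization corresponds to a moisture content of 10%.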

Total carbon (TC) and nitrogen (TN) were measured for all samples (n = 36) using a ECS 4010 Elemental combustion analyzer (Costech Analytical Technologies, Valencia, California) by the Arizona Laboratory for Emerging Contaminants (ALEC) at the University of Arizona. Major cations (K, Ca, Na, Mg) and trace metals (V, Cr, Fe, Co, As, Se) were quantified using an Agilent 7700 inductively coupled plasma mass spectrometer (ICP-MS) (Agilent, Santa Clara, CA) as described before124 with the exception that the US EPA Method 3051A was used for soil acid digestion. All measurements are available in Supplementary Data 1.

Daily precipitation data and soil and air temperature measurements were retrieved from the U.S. Climate Reference Network Data website (https://www.ncei.noaa.gov/pub/data/uscrn/). Precipitation for each sampling point was calculated as the aggregated precipitation of the 4 days preceding sampling. Normalized difference vegetation index (NDVI) values were retrieved from the Terra Moderate Resolution Imaging Spectroradiometer (MODIS) Vegetation Indices (MOD13Q1) Version 6.1125 dataset. As the NDVI values are measured on a 16-day basis, the NDVI of each sampling point was determined by interpolation using a LOESS function fitted using data collected between 05/09/2021 and 05/28/2022.
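The precipitation aggregation described above amounts to a windowed sum over daily records. The sketch below assumes that the sampling day itself is excluded from the 4-day window and that days missing from the record contribute zero; neither detail is stated in the text, and the function name and data layout are hypothetical.

```python
from datetime import date, timedelta


def aggregated_precipitation(daily_mm, sampling_day, window_days=4):
    """Sum daily precipitation (mm) over the days before `sampling_day`.

    daily_mm: dict mapping datetime.date -> precipitation in mm.
    Assumptions: the sampling day is excluded and missing days count as 0 mm.
    """
    total = 0.0
    for offset in range(1, window_days + 1):
        total += daily_mm.get(sampling_day - timedelta(days=offset), 0.0)
    return total
```

With daily records of 1.0, 0.0, 2.5 and 4.0 mm on the four days before a sampling date, the aggregated value would be 7.5 mm regardless of rainfall on the sampling day itself.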

For multivariate analysis of the environmental variables, variables were selected to avoid collinearity by calculating the variance inflation factor using the performance package126 and removing those with a variance inflation factor larger than 10. Remaining environmental variables were then standardized by z-score with the decostand function (method = standardize) of the vegan package. Standardized values were used to produce a hierarchical clustering of the samples using Manhattan distances. A PCA was also performed with these data to better visualize the differences between samples.
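The standardization and distance steps above were run in R; the following Python sketch only illustrates the arithmetic. It assumes the sample standard deviation (n − 1 denominator) for the z-score and a samples-by-variables table with no constant columns. The function names are hypothetical.

```python
from statistics import mean, stdev


def standardize_columns(table):
    """Z-score each column of a samples x variables table (list of rows).

    Assumes every column varies; a constant column would divide by zero.
    """
    scaled_columns = []
    for column in zip(*table):
        mu, sd = mean(column), stdev(column)  # sample SD (n - 1)
        scaled_columns.append([(value - mu) / sd for value in column])
    # Transpose back to one row per sample.
    return [list(row) for row in zip(*scaled_columns)]


def manhattan(u, v):
    """Manhattan (city-block) distance between two samples."""
    return sum(abs(a - b) for a, b in zip(u, v))
```

Because each variable is scaled to unit variance first, variables measured in very different units (for example pH and trace metal concentrations) contribute comparably to the Manhattan distance.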

FTICR-MS sample preparation and data preprocessing

Ultra-high resolution soil organic matter profiling was conducted using Direct Infusion Fourier Transform Ion Cyclotron Resonance Mass Spectrometry (FTICR-MS). Soil metabolites were extracted through sequential water and methanol extraction. A total of 2.5 g of each 2021 soil sample (n = 32) was mixed with 5 ml of ddH2O, followed by two rounds of vortexing and 2-h sonication. Samples were then centrifuged at 5000 rpm for 5 min. The supernatant was collected and stored at −20 °C. The remaining soil mix underwent two additional rounds of extraction with the same protocol using 5 ml of HPLC-grade methanol instead of ddH2O. Combined water and methanol extracts were purified using solid phase extraction (SPE) to remove contaminants127. Extracts were eluted using 1.5 ml of HPLC-grade methanol, and the final purified extracts were shipped to the Pacific Northwest National Laboratory (PNNL) for mass spectrometry analysis.

High-resolution mass spectra were collected with a 12-Tesla Bruker SolariX FTICR mass spectrometer (Bruker, SolariX, Billerica, MA) located at the Environmental Molecular Sciences Laboratory at PNNL in Richland, WA. Samples were directly infused into the instrument using a custom automated direct infusion cart that performed two offline blanks between each sample. The FTICR-MS was outfitted with a standard electrospray ionization (ESI) source, and data were acquired in negative and positive mode with the needle voltage set to +4.2 kV and −4.2 kV, respectively. Data were collected at 4 MW, and the ion accumulation time (IAT) was optimized for each polarity. In negative mode the IAT was set to 0.08 s and data were measured from 98.28 to 2000 m/z with a resolution of 300 K at 333.1118 m/z. In positive mode the IAT was set to 0.08 s and the data were measured from 147.42 to 1400 m/z with a resolution of 380 K at 385.1176 m/z. One hundred forty-four scans were co-added for each sample and internally calibrated using OM homologous series separated by 14 Da (–CH2 groups). The mass measurement accuracy was typically within 1 ppm for singly charged ions across a broad m/z range (100–900 m/z). Bruker Data Analysis (version 4.2) was used to convert raw spectra to a list of m/z values by applying the FTMS peak picker module with a signal-to-noise ratio (S/N) threshold of 7 and an absolute intensity threshold of 100. Chemical formulae were then assigned using Formularity128 with the following parameters: S/N > 7 and mass measurement error <0.5 ppm, allowing only for the presence of C, H, O, N, S, and P. We present results from positive mode analysis only, which revealed the strongest shifts in metabolite composition.
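The mass measurement error criterion above is expressed in parts per million. As a minimal sketch of that arithmetic only (this is not how Formularity is invoked, and the function names are hypothetical), the check for a candidate formula could look like this:

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, max_ppm=0.5):
    """True if the absolute mass error is below the threshold from the text."""
    return abs(ppm_error(measured_mz, theoretical_mz)) < max_ppm
```

At 300 m/z, a 0.5 ppm tolerance corresponds to an absolute window of only 0.00015 m/z, which is why ultra-high resolution is needed for unambiguous formula assignment.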

Thermodynamic indexes for the FTICR molecular formulae, chemodiversity indexes and putative biochemical transformations were calculated using MetaboDirect (version 1.0.7)129 keeping only peaks that were present in at least 4 samples and with the sum normalization method. Ordination analysis based on FTICR-MS data was done based on Jaccard distances using a presence/absence matrix of FTICR-MS peaks. The number of nitrogen and sulfur-containing transformations were inferred based on the assigned transformation names (Supplementary Data 6).
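The prevalence filter and the presence/absence Jaccard distance described above can be illustrated with plain Python sets. This is a sketch under the assumption that each sample is represented by the set of peaks detected in it; it does not reproduce MetaboDirect's internals, and the function names are hypothetical.

```python
def filter_peaks(presence, min_samples=4):
    """Keep only peaks detected in at least `min_samples` samples.

    presence: dict mapping sample name -> set of peak identifiers.
    """
    counts = {}
    for peaks in presence.values():
        for peak in peaks:
            counts[peak] = counts.get(peak, 0) + 1
    keep = {peak for peak, n in counts.items() if n >= min_samples}
    return {sample: peaks & keep for sample, peaks in presence.items()}


def jaccard_distance(peaks_a, peaks_b):
    """1 - |A intersect B| / |A union B| on presence/absence data."""
    union = peaks_a | peaks_b
    if not union:
        return 0.0  # convention chosen here for two empty samples
    return 1.0 - len(peaks_a & peaks_b) / len(union)
```

Two samples sharing two of four distinct peaks have a Jaccard distance of 0.5; peak intensities play no role in this metric.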

DNA extraction

DNA was extracted from 250 mg soil samples using the DNeasy PowerSoil Pro Kit (Qiagen, Hilden, Germany) according to manufacturer protocols. DNA quantification was performed using the Qubit DNA high-sensitivity kit (ThermoFisher Scientific, Waltham, MA). Separate DNA extractions were conducted for 16S amplicon sequencing and shotgun metagenomics. The University of Arizona Genetics Core (RRID: SCR_012429) performed all library preparation and sequencing data generation.

16S rRNA amplicon sequencing

Amplicon sequencing was conducted on samples collected from May to October 2021 (n = 32). The V3-V4 regions of the 16S rRNA gene were amplified using previously designed primers130 (Forward: 5’-CCTACGGGNGGCWGCAG-3’ and Reverse: 5’-GACTACHVGGGTATCTAATCC-3’), with Illumina™ overhang adapters at their 5′ ends. Amplicon amplification was performed using KAPA HiFi HotStart Ready Mix (Roche Sequencing Solutions, Pleasanton, CA) following manufacturer protocols. A secondary PCR utilized Nextera XT v2 Index kits sets A-D (Illumina, San Diego, CA). Amplicon purification employed HighPrep PCR magnetic beads (MagBio, Gaithersburg, MD) at 0.8X and 1.2X volumes for first and second PCR products, respectively. Libraries were quantified using QuantiFlor dsDNA System (Promega, Madison, WI) on a BioTek FLX 800 plate reader (BioTek, Charlotte, VT). Size determination of pooled amplicons used an AATI Fragment Analyzer with HS NGS Fragment kit (Agilent, Santa Clara, CA). Pools were diluted to <20 nM and quantified using a Roche LightCycler 480-II with KAPA Illumina Library Quantification Kit (Roche Sequencing Solutions, Pleasanton, CA). Sequencing was performed on an Illumina MiSeq platform using the v3-600 cycle kit (Illumina, San Diego, CA), generating 250 bp paired-end reads.

Metagenomics sequencing

Metagenomic libraries were prepared from samples collected in 2021 during May (Sampling time point 1; n = 4), July (Sampling time points 3, 4 and 5; n = 12), August (Sampling time point 6; n = 4), October (Sampling time point 8; n = 4), and May 2022 (Sampling time point 9; n = 4). Quality control involved genomic DNA (gDNA) quantification using QuantiFlor dsDNA System on a BioTek FLX 800 plate reader, and size and integrity assessment using AATI Fragment Analyzer with HS NGS Genomic DNA kit. Genomic DNA (350–550 ng) was sheared to 200 bp using standard conditions on an S2 sonicator (Covaris, Woburn, MA). Library preparation utilized NEBNext Ultra II DNA Library Prep Kit for Illumina (New England Biolabs, Ipswich, MA) with NEBNext Multiplex Oligos for Illumina UDI primer pairs (New England Biolabs, Ipswich, MA), employing 3 PCR enrichment cycles. Libraries were purified using 0.8X volume HighPrep PCR magnetic beads, quantified with QuantiFlor dsDNA System on a BioTek FLX 800 plate reader, and sized using AATI Fragment Analyzer with HS NGS Fragment kit. Size-adjusted libraries were quantified using Roche LightCycler 480-II with KAPA Illumina Library Quantification Kit before equal pooling. Initial metagenomic sequencing generated 2 × 150 bp paired-end reads using one full lane of a NovaSeq 6000 S4 flow cell v1.5 (300 cycles) on the NovaSeq 6000 platform (Illumina, San Diego, CA). A second sequencing round was performed for six selected samples (one per time point) using another full lane of a NovaSeq 6000 S4 flow cell v1.5 (300 cycles).

RNA extraction and sequencing

Metatranscriptomic sequencing was conducted on the same samples used for metagenomic analysis (n = 28). RNA was extracted from 2 g soil samples using RNeasy PowerSoil Total RNA kit (Qiagen, Hilden, Germany) following manufacturer's protocols. DNA removal was performed using DNase I (New England Biolabs, Ipswich, MA), followed by additional purification using RNA Clean Kit T2030 (New England Biolabs, Ipswich, MA). Initial RNA quantification employed Qubit RNA high sensitivity kit (ThermoFisher Scientific, Waltham, MA), with RNA integrity assessed via 2100 Bioanalyzer (Agilent, Santa Clara, CA) before submission to the University of Arizona Genetics Core (RRID: SCR_012429) for library preparation and sequencing.

Prior to library preparation, secondary RNA quantification was performed using Qubit Fluorometer 2.0 with Qubit Broad Range RNA Quantification Kit (ThermoFisher Scientific, Waltham, MA), and RNA quality was evaluated using AATI Fragment Analyzer with 15 nt RNA Kit. RNA ribodepletion was conducted on 50–500 ng input RNA using NEBNext rRNA Depletion Kit (Bacteria) (New England Biolabs, Ipswich, MA). Libraries were prepared using NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (New England Biolabs, Ipswich, MA) and NEBNext Multiplex UDI Primer Pairs for Illumina (New England Biolabs, Ipswich, MA), with 9–14 PCR enrichment cycles. Double purification of libraries used 0.8X volume HighPrep PCR magnetic beads. RNA library quantification and sequencing followed the same protocols described for DNA libraries.

Comparison with other datasets

Metagenomics read signatures were compared with shotgun metagenomics reads available in public databases from other arid ecosystems across the world including the Negev Desert (PRJEB36534)14 and (PRJNA657906)131, Sinai Desert (PRJNA810925)132, Xilingol League (PRJNA916612)133, Mojave Desert (PRJNA620793)134, Cuatro Cienagas (PRJNA857298)135, as well as our same study ecosystem, the Sonoran Desert (PRJNA664514)136, using MASH distances137. Microbial taxonomic profiles of all datasets were inferred with SingleM (v0.18.2)138.
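Mash converts a MinHash estimate of the Jaccard index between two k-mer sets into a distance. The sketch below shows only that conversion; the k-mer size of 21 is the Mash default and is assumed here, since the text does not report the settings used, and the function name is hypothetical.

```python
import math


def mash_distance(jaccard, k=21):
    """Mash distance D = -(1/k) * ln(2j / (1 + j)), bounded to [0, 1].

    jaccard: estimated Jaccard index between two k-mer sketches.
    k: k-mer size (21 is assumed, not taken from the text).
    """
    if not 0.0 <= jaccard <= 1.0:
        raise ValueError("Jaccard index must be between 0 and 1")
    if jaccard == 0.0:
        return 1.0  # no shared k-mers: maximum distance
    distance = -math.log(2.0 * jaccard / (1.0 + jaccard)) / k
    return min(1.0, max(0.0, distance))
```

Identical sketches (Jaccard index of 1) give a distance of 0, and the distance grows as fewer k-mers are shared between the two read sets.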

Microbial community profiling and analysis

Microbial community profiles based on ASVs were generated using 16S rRNA gene amplicon sequencing reads with the DADA2 (v1.26) pipeline139. Briefly, adapter trimming was performed using cutadapt140, trimmed reads were then filtered and quality trimmed to a minimum length of 50 and maximum expected error of 2. Error rates were learned directly from processed reads using DADA2 sample inference algorithm to infer sequence variants. Fully denoised sequences were generated by merging ASVs from forward and reverse reads followed by chimeric sequence removal. Taxonomy was assigned using a naive Bayesian classifier141 with Genome Taxonomy Database (GTDB) version r220142.
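The read filter above combines a minimum length with a maximum number of expected errors. As an illustration of how expected errors follow from Phred quality scores (a sketch of the arithmetic, not a call into DADA2; the function names are hypothetical):

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities, each 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def passes_filter(phred_scores, min_len=50, max_ee=2.0):
    """Keep reads that are long enough and have at most `max_ee` expected errors."""
    return len(phred_scores) >= min_len and expected_errors(phred_scores) <= max_ee
```

A 100-base read with Q30 at every position has 0.1 expected errors and passes, whereas a 60-base read at Q10 throughout has 6 expected errors and is discarded.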

To account for 16S rRNA gene copy number variation effects on community analysis143, metagenomic-based taxonomic profiling was also performed using SingleM (v0.18.2)138 on metagenomic raw reads following the tool authors’ recommendations. Alpha diversity indexes were calculated as the average value of the indexes obtained from each marker gene. For other downstream analyses we used the rplP gene (encoding ribosomal protein L16/L10e), due to its effectiveness in distinguishing between closely and distantly related genomes144.

Statistical analyses were conducted in R (v4.3.2)145. The 16S and SingleM datasets were rarefied to 25,000 and 800 reads, respectively, for alpha and beta diversity calculations. Other analyses using these datasets were made using the unrarefied values. Alpha diversity (observed richness, Shannon diversity index, and Chao1 estimated richness) was calculated with the phyloseq R package146, while beta diversity analysis was done with Bray-Curtis distances calculated with the vegan package147. Principal coordinates analysis (PCoA) was performed using phyloseq, with PERMANOVA testing for differences between sampling time points.
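The analyses above were run with phyloseq and vegan in R. The following Python sketch only illustrates what rarefaction, the Shannon index, and the Bray-Curtis dissimilarity compute. Samples are assumed to be dictionaries of taxon counts, the fixed random seed is an illustrative choice, and the function names are hypothetical.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Subsample reads without replacement down to `depth` reads."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    subsample = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for taxon in subsample:
        rarefied[taxon] = rarefied.get(taxon, 0) + 1
    return rarefied


def shannon(counts):
    """Shannon diversity H' = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count dictionaries."""
    taxa = set(a) | set(b)
    numerator = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    denominator = sum(a.values()) + sum(b.values())
    return numerator / denominator if denominator else 0.0
```

Rarefying both samples to a common depth before computing these indexes keeps differences in sequencing effort from being mistaken for differences in diversity.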

For phylogenetic analysis of 16S rRNA data, multiple sequence alignment was done using the DECIPHER package (version 2.26.0)148. A maximum-likelihood phylogenetic tree was constructed using FastTree (version 2.1.1)149, with a generalized time-reversible model and gamma option for branch length rescaling and Gamma20-based likelihood computation. The tree was midpoint-rooted using the phytools package (version 2.3.0)150. Compositional stability was quantified by calculating a multi-order compositional stability index103 between each pair of samples collected from the same subsite at two adjacent time points.

Metagenome processing and assembly

Sequencing generated 44–180 million read pairs during the first round and 416–588 million read pairs during the second round per sample. Read quality assessment was performed using FastQC (v0.11.9)151. Raw reads underwent preprocessing using the RQCFilter pipeline (BBtools version 38.90)152. Cleaned reads were assembled with MEGAHIT (v1.2.9)153 with the “--presets meta-large” option as recommended for high-biodiversity soil metagenomes. To optimize metagenome assembled genome (MAG) recovery, eight metagenome assemblies were generated using two strategies: (1) co-assembly of samples from the same sampling point (6 assemblies: May, July_2, July_3, August, October, May_2022) and (2) co-assembly of samples from the same sequencing run (2 assemblies: one with all samples from the first sequencing run (28 samples), the other with all samples from the second sequencing run (6 samples); see Metagenomics sequencing section). Assemblies were then filtered to retain only contigs larger than 2000 bp. In total we retrieved between 0.3 and 1.5 million contigs from each assembly, with an N50 between 3200 and 3500 bp. Read mapping to the filtered assemblies using CoverM (version 0.7.0)154, with flags -p minimap2-sr -m trimmed_mean --min-read-percent-identity 0.95 --min-read-aligned-percent 0.75, indicated that the filtered contigs accounted for 29.74–42.89% of the reads of each sample.
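The contig length filter and the N50 statistic reported above can be sketched as follows. This is a generic illustration of the standard N50 definition, not the script used in this study; the function names and the toy lengths in the usage note are assumptions.

```python
def filter_contigs(lengths, min_len=2000):
    """Keep contigs strictly larger than min_len base pairs."""
    return [n for n in lengths if n > min_len]

def n50(lengths):
    """Length of the contig at which the running sum of lengths
    (sorted from longest to shortest) reaches half the total."""
    if not lengths:
        raise ValueError("no contigs supplied")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for n in ordered:
        running += n
        if running >= half:
            return n
```

With toy lengths `[2, 2, 2, 3, 3, 4, 8, 8]` (total 32), the two longest contigs already cover half of the total, so `n50` returns 8.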

Recovery of metagenome assembled genomes

For MAG recovery, filtered contigs of each assembly were binned separately using CONCOCT (v1.0.0)155, MetaBAT2 (v2.12.1)156, and MaxBin2 (v2.2.6)157 with default options. The resulting bin sets were refined using the bin refinement module from metaWRAP (v1.3.0)158, with parameters -c 50 and -x 10. Refined bins were manually curated using anvi’o (version 8)159, based on sequence coverage and the presence of single-copy core genes identified with anvi-run-hmms. Bin quality and completeness were checked using CheckM2 (v1.0.2)160. Only bins with more than 50% completeness and less than 10% contamination (medium quality draft genomes) were retained and used for the analysis, resulting in a set of 661 bins. Bins generated from the different assemblies were dereplicated at 95% average nucleotide identity (species level161) using galah (v0.4.0)162, resulting in 282 dereplicated MAGs. All downstream analyses using both the metagenomics and metatranscriptomics data refer exclusively to this dereplicated MAG set. Read mapping using CoverM, with the same parameters as before, revealed that the dereplicated MAGs accounted for 3.40–9.11% of the reads from each sample, comparable to other metagenomics studies of soil microbial communities (2.7–22.4%)111. This resulted in mean coverage depths for each MAG ranging from 0.9X to 100X per sample, with over 50% of MAGs exceeding 3X average coverage across all samples. These results show that the multilayered co-assembly strategy supported confident genome binning, robust taxonomic classification, and functional potential profiling that would not have been possible otherwise. To determine how much of the microbial communities was represented by the recovered MAGs, we compared the taxonomic profiles of the reads, the assemblies, and the recovered MAGs using SingleM appraise at 86% sequence identity (genus level) and the --imperfect flag.
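The medium-quality cutoff described above amounts to a simple filter on the quality estimates of each bin. The sketch below assumes the estimates have already been parsed into dictionaries; the field names and the example values are hypothetical and do not correspond to bins from this study.

```python
def medium_quality(bins, min_completeness=50.0, max_contamination=10.0):
    """Return names of bins with >50% completeness and <10% contamination."""
    return [
        b["name"]
        for b in bins
        if b["completeness"] > min_completeness
        and b["contamination"] < max_contamination
    ]

# Hypothetical quality estimates, for illustration only.
example_bins = [
    {"name": "bin_1", "completeness": 92.4, "contamination": 1.3},
    {"name": "bin_2", "completeness": 48.0, "contamination": 0.5},
    {"name": "bin_3", "completeness": 75.0, "contamination": 12.1},
]
```

Here only `bin_1` passes both thresholds.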

Dereplicated MAGs were taxonomically classified using the Genome Taxonomy Database Toolkit (GTDB-Tk) (v2.4.0)163 and the R09-RS220 database. Phylogenetic trees for bacterial and archaeal species were built separately using the de novo workflow of GTDB-Tk based on 120 marker genes for bacteria and 53 marker genes for archaea. Trees were rooted using p__Patescibacteria and p__Altiarcheota for the bacterial and archaeal trees, respectively. Visualization of the trees was performed using the ggtree164 and ggtreeExtra165 R packages.

Functional annotation of the dereplicated MAGs was performed using the automated DRAM pipeline166. Additional databases used in the same DRAM run included NCycDB167, SCycDB168, and PCycDB169. KEGG annotations were integrated with annotations from NCycDB, SCycDB, and PCycDB based on KEGG Orthology (KO) numbers. This integration enabled differentiation between closely related orthologs that share a KO in the KEGG database (e.g., K10944, which encodes either amoA or pmoA) using the results from the metabolism-specific databases.

MAG stress tolerance trait identification was based on literature-derived definitions, KEGG database annotations170, and modified microtrait definitions171 as detailed in Supplementary Data 7. Traits were considered present in a MAG when at least 50% of trait-defining genes were detected. Functional pathway analysis for carbon, nitrogen, and sulfur metabolism in MAGs was conducted at both reaction (Supplementary Data 8) and pathway levels (Supplementary Data 9). Reaction and pathway definitions, including key enzymes, were sourced from the KEGG database and literature review. A reaction was considered present in a MAG when at least 50% of the required genes were detected. Similarly, pathways were considered present when a MAG contained at least 50% of pathway reactions plus required key enzymes.
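The 50% presence rules above can be expressed compactly. The sketch below is one possible reading, under two labeled assumptions: the cutoff is inclusive (at least 50%), and "plus required key enzymes" means that every key enzyme of the pathway must be detected. Function names and the placeholder gene labels used in the usage note are hypothetical.

```python
def fraction_detected(required, detected):
    """Fraction of required genes that were detected in a MAG."""
    required = set(required)
    return len(required & set(detected)) / len(required)

def trait_present(trait_genes, mag_genes, cutoff=0.5):
    """A trait (or reaction) is present if >= cutoff of its genes are detected."""
    return fraction_detected(trait_genes, mag_genes) >= cutoff

def pathway_present(reactions, key_enzymes, mag_genes, cutoff=0.5):
    """reactions maps a reaction name to the list of genes it requires.

    Assumption: all key enzymes must be detected in addition to
    at least `cutoff` of the pathway reactions being present.
    """
    n_present = sum(
        trait_present(genes, mag_genes, cutoff) for genes in reactions.values()
    )
    enough_reactions = n_present / len(reactions) >= cutoff
    keys_detected = set(key_enzymes) <= set(mag_genes)
    return enough_reactions and keys_detected
```

For instance, a pathway with reactions `{"r1": ["gA", "gB"], "r2": ["gC"]}` and key enzyme `gA` would be called present in a MAG carrying only `gA`, because one of two reactions passes the cutoff and the key enzyme is detected.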

Maximal growth rate for each of the MAGs was predicted based on genome-wide codon usage statistics using the gRodon2172 R package. Briefly, minimal doubling time was predicted with the predictGrowth function using mode=partial to account for the medium-quality draft nature of the dereplicated MAGs. Maximal growth rate was calculated as 1/minimal doubling time.

Metatranscriptomics processing

The quality of metatranscriptomics reads was evaluated with FastQC151. Raw reads were processed with the RQCFilter pipeline (BBtools version 38.90)152 to trim adapters and filter contaminants (parameters: removehuman=t removecat=t removedog=t removemouse=t lambda=t). Further filtering of ribosomal RNA was performed using SortMeRNA (version 4.3.6)173. Bowtie2 (v2.5.4)174 was used to map the filtered and trimmed reads against the dereplicated MAGs. The resulting SAM files were sorted using samtools175. Mapped read counts were obtained at the transcript level and summarized based on gene_id (parameters: -t CDS -g gene_id) using featureCounts (version 2.0.6)176. Only transcripts with read counts higher than 5 and present in at least 3 samples were kept for downstream analysis. Read counts were normalized following the gene length corrected trimmed mean of M-values (GeTMM) method177.
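The transcript filter and the gene-length correction that precedes TMM scaling can be sketched as below. Two assumptions are made: the filter is read as "more than 5 reads in at least 3 samples", and only the reads-per-kilobase step is shown, since the TMM scaling factors themselves are normally computed with a dedicated package and are not reimplemented here. Function names are hypothetical.

```python
def keep_transcript(counts, min_count=5, min_samples=3):
    """True if the transcript has > min_count reads in >= min_samples samples."""
    return sum(1 for c in counts if c > min_count) >= min_samples

def reads_per_kilobase(counts, gene_length_bp):
    """Gene-length correction: divide raw counts by gene length in kb.

    The resulting values would then be passed to TMM normalization
    (not shown in this sketch).
    """
    kb = gene_length_bp / 1000
    return [c / kb for c in counts]
```

A transcript with counts `[6, 6, 6, 0]` across four samples is kept, whereas `[6, 6, 5, 5]` is dropped because only two samples exceed 5 reads.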

A MAG was considered to express a metabolic pathway based on KEGG definitions as follows:

  1. A reaction was considered expressed if at least 50% of the genes of any of its alternative gene sets were expressed (KOs forming a gene set are joined by a plus sign in the definition; alternative gene sets are separated by commas).

  2. A module was considered expressed if at least 50% of its reactions were expressed. This 50% cutoff was selected to account for the incompleteness of MAGs, which are often neither fully assembled nor circularized, resulting in partial metabolic pathways.
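The two rules above can be illustrated with a short Python sketch. It handles only flat definitions of the form `A+B,C` (plus signs within a gene set, commas between alternatives); nested parentheses and other operators found in full KEGG module definitions are deliberately not parsed. Function names and the single-letter KO labels in the usage note are placeholders, not real identifiers.

```python
def reaction_expressed(definition, expressed_kos, cutoff=0.5):
    """True if any alternative gene set has >= cutoff of its KOs expressed."""
    for alternative in definition.split(","):
        kos = alternative.split("+")
        hits = sum(1 for ko in kos if ko in expressed_kos)
        if hits / len(kos) >= cutoff:
            return True
    return False

def module_expressed(reaction_definitions, expressed_kos, cutoff=0.5):
    """True if >= cutoff of the module's reactions are expressed."""
    flags = [
        reaction_expressed(d, expressed_kos, cutoff)
        for d in reaction_definitions
    ]
    return sum(flags) / len(flags) >= cutoff
```

With only `A` expressed, the reaction `A+B,C` counts as expressed (one of two KOs in the first gene set), while `A+B+C,D` does not (one of three, and the alternative `D` is absent).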

Microbial and metabolite community assembly analysis

Microbial phylogenetic signal

Microbial community assembly shifts during the monsoon season were analyzed using ecological null modeling64,65, based on the assumption that phylogenetically related ASVs share ecological similarities (phylogenetic signal)65. Phylogenetic signal analysis encompassed 19 environmental variables (detailed in Supplementary Data 1), using the between-ASV difference in environmental optima and between-ASV phylogenetic distance.

Environmental optima were calculated as abundance-weighted mean values for each environmental variable using the analogue package (version 0.17.6)178. Between-ASV environmental optima differences were computed using Euclidean distance across all environmental variables. Between-ASV phylogenetic distances were derived from the microbial phylogenetic tree using the adephylo package (version 1.1.16)179. Phylogenetic signal was assessed using a Mantel correlogram based on Pearson’s correlation coefficient across 50 distance classes with 999 permutations, applying progressive Holm-Bonferroni correction for multiple testing.
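For illustration, an abundance-weighted optimum and the Euclidean distance between two ASVs' optima can be computed as follows. This is a generic sketch rather than the analogue package routine; the function names and the toy numbers are assumptions, and any scaling of the environmental variables before computing distances is left out.

```python
import math

def weighted_optimum(abundances, env_values):
    """Abundance-weighted mean of one environmental variable for one ASV.

    abundances[i] is the ASV abundance in sample i and env_values[i]
    is the value of the variable measured in that sample.
    """
    total = sum(abundances)
    return sum(a * e for a, e in zip(abundances, env_values)) / total

def optima_distance(optima_a, optima_b):
    """Euclidean distance between two ASVs' vectors of optima."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(optima_a, optima_b)))
```

An ASV with abundances `[1, 3]` in two samples where the variable takes values `[10, 20]` has an optimum of 17.5.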

Metabolite dendrogram construction

To analyze monsoon-mediated shifts in metabolite assembly processes in arid soils, we applied ecological null modeling methods originally developed for microbial data64,65, and then adapted for the analysis of metabolites66,67. This framework allows us to infer whether shifts in community composition (microbial or metabolite) are primarily shaped by consistent environmental filters, random processes, or spatial dispersal dynamics, key to understanding how monsoon regimes modulate soil biogeochemistry in arid ecosystems.

Metabolites were categorized into three groups based on molecular class assignments180: group 1, lipid-like compounds; group 2, carbohydrate-, protein-, and amino sugar-like compounds, referred to as sugars; and group 3, tannin-, lignin-, and condensed hydrocarbon-like compounds, referred to as TLC.

Relational dendrograms were constructed following the methodology described in ref. 66 for both the bulk metabolite data and the three metabolite pools, using three approaches: (1) metabolite molecular characteristics (MCD), based on metabolite structural properties (e.g., elemental composition, double-bond equivalents, modified aromaticity index, and Kendrick’s defect); (2) metabolite potential biochemical transformations (TD), based on putative chemical conversions derived from the ultrahigh-resolution mass differences between metabolites enabled by FTICR MS data; and (3) metabolite transformation-weighted characteristics (TWCD), integrating both MCD and TD dendrograms. Dendrogram construction used binary presence/absence values to eliminate charge competition bias in abundance estimates66.

Microbial and metabolite β-diversity analysis and ecological null modeling

Ecological processes influencing microbial and metabolome assemblages during monsoon periods were estimated using βNTI and Raup-Crick Bray-Curtis index (RCBC), analyzing the microbial phylogenetic tree and TWCD metabolite-derived dendrogram as previously described64,65,66. Briefly, microbial and metabolite β-mean nearest taxon distance (βMNTD) was calculated using the comdist function from the picante package (v1.8.2) and compared to a null distribution generated from 1000 randomizations. The βNTI was then computed by standardizing the observed βMNTD against the null expectation, following the approach described in ref. 64.
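The standardization step can be written as βNTI = (observed βMNTD − mean of null βMNTD) / standard deviation of null βMNTD. A minimal sketch for a single pair of samples follows; it assumes the observed value and the null values from the randomizations are already available, and it uses the sample standard deviation, which is an assumption about the estimator. The function name is hypothetical.

```python
import statistics

def beta_nti(observed_bmntd, null_bmntd):
    """Standardize an observed betaMNTD against its null distribution."""
    null_mean = statistics.mean(null_bmntd)
    null_sd = statistics.stdev(null_bmntd)  # sample SD (assumption)
    return (observed_bmntd - null_mean) / null_sd
```

For example, an observed value of 5.0 against null values `[1.0, 2.0, 3.0]` (mean 2, standard deviation 1) gives a βNTI of 3.0.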

Following this approach, |βNTI| > 2 indicates the predominance of deterministic processes, such as environmental abiotic factors and biotic interactions that influence microbial community turnover, and differences in production and degradation rates of metabolites. Meanwhile, |βNTI| < 2 indicates that stochastic processes, such as random birth, death, colonization, and speciation events in microbial communities181, cause changes in species diversity and composition. Similarly, metabolite shifts caused by physical forces or vector movements are considered stochastic changes66. Moreover, βNTI > 2 indicates variable selection, in which divergent environmental factors drive high compositional turnover between a pair of communities. Conversely, βNTI < −2 indicates homogeneous selection, in which consistent selective pressures from stable environmental conditions are the primary cause of low compositional turnover between local communities64.

Furthermore, the RCBC turnover index was used to determine the influence of stochastic ecological processes. Briefly, the observed presence/absence-based Bray-Curtis values from pairwise community comparisons were calculated and compared to a null expectation (generated through 1000 randomizations). Deviations of observed values from the null comparisons were then normalized between +1 and −1 to produce the RCBC metric. An RCBC > 0.95 with |βNTI| < 2 indicates that higher-than-expected compositional differences between communities (or metabolomes) are primarily due to dispersal limitation, allowing ecological drift. In contrast, an RCBC < −0.95 with |βNTI| < 2 suggests that lower-than-expected compositional differences are driven mainly by homogenizing dispersal. Finally, when |RCBC| < 0.95 and |βNTI| < 2, compositional turnover between communities (or metabolomes) is not predominantly influenced by selection, dispersal, or ecological drift, a condition referred to as undominated65.
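The decision rules described for βNTI and RCBC can be summarized as a small classifier for one pairwise comparison. The sketch below follows the thresholds stated in the text; how values falling exactly on a threshold are treated is not specified there, so assigning them to the next category down is an assumption. The function name is hypothetical.

```python
def assembly_process(beta_nti, rcbc):
    """Assign the dominant assembly process for one pair of samples."""
    if beta_nti > 2:
        return "variable selection"
    if beta_nti < -2:
        return "homogeneous selection"
    # |betaNTI| <= 2: stochastic processes, resolved with RCBC.
    if rcbc > 0.95:
        return "dispersal limitation"
    if rcbc < -0.95:
        return "homogenizing dispersal"
    return "undominated"
```

For example, a pair with βNTI = 0.4 and RCBC = 0.99 is assigned to dispersal limitation, whereas βNTI = −3.1 is assigned to homogeneous selection regardless of RCBC.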

Co-occurrence network analysis

MAG interactions were analyzed through microbial co-occurrence networks using a random matrix theory (RMT)-based approach76. Networks were constructed using the Molecular Ecological Network Analysis pipeline (MENA)76,182 website (http://ieg4.rccc.ou.edu/mena/login.cgi). Two separate networks were constructed, one using data from the sampling points of two dry months (May and October) and another using data from two wet sampling points (July and August). MAG abundances, expressed as trimmed means of M-values (TMM), were obtained using CoverM. Separate abundance tables for each network were uploaded and processed using the MENA pipeline. Briefly, a similarity matrix was built using Spearman correlations of features present in at least half of the samples. An association threshold was then determined using the RMT approach182. Network topological parameters were calculated with the MENA pipeline. Module determination employed a greedy modularity optimization algorithm183. Node topological roles were assigned based on within-module connectivity (Zi) and participation coefficient (Pi)76. Networks were visualized using the igraph184 and ggnetwork185 R packages.
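To make the two node-role metrics concrete, the sketch below computes Zi and Pi from an edge list and a module assignment, using the common definitions Zi = (k_i − mean(k_s)) / sd(k_s) and Pi = 1 − Σ_s (k_is / k_i)², where k_is is the number of links from node i to module s. It is not the MENA implementation; the function name, the use of the population standard deviation, and the convention of returning 0 when the standard deviation or degree is zero are assumptions.

```python
import statistics
from collections import defaultdict

def node_roles(edges, module_of):
    """Return {node: (Zi, Pi)} for an undirected, unweighted network."""
    # links_to[node][module] = number of neighbours of node in that module
    links_to = {n: defaultdict(int) for n in module_of}
    for a, b in edges:
        links_to[a][module_of[b]] += 1
        links_to[b][module_of[a]] += 1

    # Within-module degree of every node, grouped by module.
    within = defaultdict(dict)
    for n, m in module_of.items():
        within[m][n] = links_to[n][m]

    roles = {}
    for n, m in module_of.items():
        values = list(within[m].values())
        mean = statistics.mean(values)
        sd = statistics.pstdev(values)  # population SD (assumption)
        zi = (within[m][n] - mean) / sd if sd > 0 else 0.0
        degree = sum(links_to[n].values())
        if degree:
            pi = 1 - sum((k / degree) ** 2 for k in links_to[n].values())
        else:
            pi = 0.0
        roles[n] = (zi, pi)
    return roles
```

A node whose links all fall within its own module has Pi = 0, while a node with two links inside its module and one link to another module has Pi = 1 − (4/9 + 1/9) = 4/9.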

Metabolic interaction network

Draft genome-scale metabolic models (GEMs) were reconstructed for each MAG using CarveMe186, with distinct templates for bacterial and archaeal MAGs. Models were constructed without gap filling to avoid false-positive predictions of cross-feeding metabolic interactions187.

To better understand metabolic interactions among the microorganisms in each of the co-occurrence networks, metabolic interactions were assessed using SMETANA79. SMETANA was run in global mode for each possible pair of MAGs within each of the two co-occurrence networks to calculate a metabolic interaction potential (MIP) score between each pair. Only pairs with MIP scores ≥5 were considered highly interacting and were used for building the metabolic interaction network188. The cross-feeding potential between pairs of highly interacting MAGs was then investigated using the detailed mode of SMETANA, keeping only those metabolites with a SMETANA score greater than or equal to 0.1, as well as those related to nitrogen and sulfur metabolism. Node topological roles were assigned based on within-module connectivity (Zi) and participation coefficient (Pi)76. Networks were visualized using the igraph and ggnetwork R packages.