Introduction

Geothermal aquifers host permeable layers of fluid-bearing rock, storing the heat from the Earth’s interior1. The high permeability allows the continuous circulation of heat and fluids, along with the formation of temperature and redox gradients2,3. Geothermal environments are essential for the emission of deep Earth compounds, such as H2, N2, sulfur vapor, CO, and NH3, thus providing energy and carbon sources for aquatic microorganisms and contributing to biogeochemical cycling4,5. The physical, chemical, and geological settings determine the taxonomic diversity and metabolic activity of microbial communities, which are the basis of geothermal ecology6. Spread across the globe7,8,9, geothermal hot springs have been described both as hotspots of microbial diversity and as environments with comparatively low biodiversity10,11. Microbial biomass and diversity are generally lower there than in other aquatic environments owing to extremely limiting conditions of temperature, light, and pH12,13,14. Notably, the study of thermophilic and hyperthermophilic microorganisms, adapted to thrive under such extreme conditions, can help in understanding the evolution of life on Earth and beyond15,16,17.

Over the past two decades, research has significantly enhanced our understanding of microbial diversity and function within aquatic ecosystems. Groundwater microbial communities were found to be primarily composed of a few key taxa such as Patescibacteriota, Nitrospirota, and Pseudomonadota, mainly represented by the genera Nitrospira, Flavobacterium, Pseudomonas, and Rhodoferax18,19,20,21. Actinobacteriota, Bacteroidota, and Bacillota were also identified among the microbial taxa present in groundwater22,23,24,25. Geothermal areas have been extensively investigated, and ubiquitous microorganisms have been identified26. In particular, the most commonly detected taxa included the cyanobacteria Synechococcus, the photosynthetic bacteria Chloroflexus, the sulfur-related chemolithotrophs Acidithiobacillus and Sulfolobus, the heterotrophic bacteria Thermus, and the archaea Thermoplasmata26,27.

The complex structure and dynamics of geothermal ecosystems may play an active role in selecting microbial communities and affecting their spatial distribution and activity28. Indeed, unique environmental conditions such as temperature (high temperature favors thermophilic microorganisms), pH (geothermal waters can be acidic or alkaline), nutrient availability (the presence/absence of organic matter and inorganic compounds), and mixing with different water sources (the introduction of mesophilic and/or non-thermophilic microorganisms may influence interactions among taxa) consistently affect the groundwater microbial consortia29,30,31. The discharge from hot springs can flow into nearby freshwater environments (e.g., streams, ponds, lakes), also replenishing local groundwater32. The mixing of thermal and non-thermal waters can lead to relevant physical-chemical modifications33. Hydrothermal systems contain high levels of chemical species that generally occur at only trace levels in non-thermal waters (e.g., As, U, V, NO3-)34. Geogenic water contamination, as in the case of arsenic (As), resulting from the mixing of thermal water with groundwater or surface water, can deteriorate the quality of water resources and their suitability for human use35. This process is one of the most critical environmental factors affecting the quality of water for human consumption and recreational activities34,36.

While the influence of water mixing on local geochemical processes in geothermal aquifers is documented, our understanding of the ecological functions and spatial dynamics of microbial communities within aquifer systems, and of their potential role in mediating biogeochemical transformations, is still developing37,38. A deeper understanding of microbial diversity, metabolic potential, and ecological interactions in groundwater mixing zones is crucial to assess their influence on biogeochemical cycles (i.e., sulfur, nitrogen, and carbon), and to predict the ecological consequences of water mixing on ecosystem functioning. Following two previous studies performed in the same area39,40, we selected fifteen different freshwater sources, differently influenced by the rise of As-rich thermal waters (50–64 °C) from a deep aquifer consisting of Mesozoic sedimentary rocks, locally uplifted, fractured, and faulted41. This previous research39,40 shed light on the structure of microbial communities in As-rich geothermal waters, highlighting their tolerance to high As levels and pointing to the need for further research on thermophiles adapted to such conditions. The present study aimed to (i) unravel the microbial taxonomic diversity and metabolic potential in thermal waters and groundwaters, and (ii) identify microbial indicators of water mixing. We hypothesized that specific microbial taxa or potential functional properties would be enriched or depleted along with water mixing, allowing for the identification of areas where thermal waters and groundwaters may closely interact.

Methods

Study site

Water samples were collected from the Cimino-Vico volcanic area (Fig. S1), close to the city of Viterbo in the Latium region (Central Italy). This area is a complex geothermal site, characterized by several perched aquifers and a continuous basal aquifer flowing through volcanites41. In this volcanic region, widespread hydrothermal circulation has been reported, as underlined by the occurrence of geothermal fields42,43. The natural occurrence of As is explained by the complexity of the hydrostratigraphy, the structural setting of the area, and the related mixing between water circulating in the basal volcanic aquifer and the fluids that rise from depth through fractures and faults generated in the Upper Cretaceous-Oligocene flysch by volcanic basement uplift, all of which characterize the active geothermal system44,45,46. The sampling survey was aimed at collecting eight thermal waters (T1: water in a pool fed by a hot spring and cooled down to ambient temperature; T2, T3, T4, T5: waters from hot pools; T6, T7, T8: waters from three hot springs) and seven groundwaters (G1–G7) with depths ranging between 27 m and 240 m.

Physical-chemical analyses

Temperature (T), pH, electrical conductivity (EC), and dissolved oxygen (DO) were measured on-site with field probes (Hach HQ 40d). Water samples (50 mL) were directly stored at 4 °C for anion analysis by ion chromatography (Dionex DX-120); a further 50 mL of water was filtered in situ through 0.45 μm cellulose acetate membrane filters (Whatman), acidified to 2% HNO3 with Suprapur acid (65%, Sigma-Aldrich), and stored at 4 °C for cation analysis by ICP-MS equipped with an Octopole Reaction System (ORS) (Agilent 7500c). Arsenic speciation was assessed by hydride generation-atomic absorption spectrometry (HG-AAS, Perkin Elmer AAnalyst 800). Arsenite, As(III), was determined using 2% HCl as carrier, and reduction to arsine gas was performed with 0.4% NaBH4 in acetate-buffered samples at pH 4–4.5. Total As (Astot) was analyzed by HG-AAS after pre-reduction of As(V) to As(III) with a 5% KI/ascorbic acid solution. The As(V) concentration was obtained by difference (details in Casentini et al.47).
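
The calculation of As(V) by difference can be illustrated with a minimal sketch. The function name and the example concentrations are hypothetical and are not taken from the dataset; clamping small negative differences to zero is an assumption about how analytical noise might be handled, not a step reported in the protocol.

```python
def arsenate_by_difference(as_total_ug_l: float, as_iii_ug_l: float) -> float:
    """Estimate As(V) (µg/L) as total As minus As(III)."""
    if as_total_ug_l < 0 or as_iii_ug_l < 0:
        raise ValueError("concentrations must be non-negative")
    # Assumption: a slightly negative difference is treated as zero,
    # since it can only arise from measurement uncertainty.
    return max(as_total_ug_l - as_iii_ug_l, 0.0)


# Hypothetical sample: 100 µg/L total As, of which 60 µg/L is As(III)
as_v = arsenate_by_difference(100.0, 60.0)
```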

Microbial cell abundance by flow cytometry

Groundwater samples were fixed in formaldehyde solution (1% vol/vol final concentration) and kept at 4 °C for a maximum of 24 h. The flow cytometer A50-micro (Apogee Flow System, Hertfordshire, England), equipped with a solid-state laser set at 20 mV and tuned to an excitation wavelength of 488 nm, was used to count and characterize microbial cells in fixed samples. The volumetric absolute cell counting was carried out on samples stained with SYBR Green I (1:10,000 dilution; Molecular Probes, Invitrogen). Apogee Histogram Software (v89.0) was used to plot and analyze data; the light scattering signals (forward and side scatters) and the green fluorescence (530/30 nm) were considered for the single-cell characterization. Thresholding was set on the green channel. Voltages were adjusted to place the background and instrumental noise below the first decade of green fluorescence. Samples were run at low flow rates to keep the number of events below 1000 events s−1. The intensity of green fluorescence emitted by SYBR-positive cells allowed for the discrimination among cell groups exhibiting two different nucleic acid content (cells with Low Nucleic Acid content - LNA; cells with High Nucleic Acid content – HNA)48.
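
As an illustration of the LNA/HNA discrimination, the sketch below splits SYBR-positive events into two groups using a single green-fluorescence cut-off. The threshold value and the event intensities are hypothetical; in practice the gate is drawn on the cytogram and is instrument- and sample-dependent.

```python
def hna_lna_percentages(green_fluorescence, hna_threshold):
    """Return the percentage of HNA and LNA cells among stained events.

    green_fluorescence: iterable of per-event green fluorescence intensities
    hna_threshold: hypothetical cut-off; events at or above it count as HNA
    """
    events = list(green_fluorescence)
    if not events:
        raise ValueError("no events to classify")
    hna = sum(1 for value in events if value >= hna_threshold)
    total = len(events)
    return {
        "HNA_percent": 100.0 * hna / total,
        "LNA_percent": 100.0 * (total - hna) / total,
    }


# Four hypothetical events, two on each side of the cut-off
fractions = hna_lna_percentages([10, 20, 300, 400], hna_threshold=100)
```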

DNA extraction

An aliquot of each water sample (750–1000 mL) was filtered through polycarbonate membranes (pore size 0.2 μm, 47 mm diameter, Nuclepore) and immediately stored at −20 °C until DNA extraction, which was performed with a DNeasy PowerSoil Pro Kit (QIAGEN, Italy) following the manufacturer’s protocol. The extracted DNA was eluted in 100 µL of nuclease-free water. The final quality and quantity of DNA were validated by 1% agarose gel electrophoresis, Nanodrop 3300 (ThermoScientific, Monza, Italy), and Qubit 3.0 Fluorometer (Thermo Fisher Scientific, Waltham, MA, United States). DNA was stored at −20 °C in small aliquots.

16S rRNA amplicon sequencing and bioinformatics

The DNA extracts were used as a template for the high-throughput sequencing of the 16S rRNA gene following the library preparation and protocol reported previously49,50. The primer pair 27F (5′-AGAGTTTGATCCTGGCTCAG-3′) and 534R (5′-ATTACCGCGGCTGCTGG-3′) was used for the amplification of the V1-V3 region of the bacterial 16S rRNA gene. The primer pair 340F (5′-CCCTAHGGGGYGCASCA-3′) and 915R (5′-GWGCYCCCCCGYCAATTC-3′) was used to target the V3-V5 region of the archaeal 16S rRNA gene. The Qubit 3.0 Fluorometer (Thermo Fisher Scientific, United States) was used for the quantification of library concentration. The purified libraries were pooled in equimolar concentrations and diluted to 4 nM. The samples were paired-end sequenced (2 × 301 bp) on a MiSeq platform (Illumina, United States of America) using a MiSeq Reagent kit v3, 600 cycles (Illumina, United States of America) following the standard guidelines for preparing and loading samples, with 20% PhiX control library.
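
The archaeal primers contain degenerate IUPAC bases (H, Y, S, W), so each primer stands for a small pool of sequences. The following sketch enumerates that pool; the function name is illustrative and the code is not part of the published workflow.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer: str) -> list:
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Archaeal primers from the text, written 5' to 3'
variants_340f = expand_primer("CCCTAHGGGGYGCASCA")   # H x Y x S = 3 x 2 x 2
variants_915r = expand_primer("GWGCYCCCCCGYCAATTC")  # W x Y x Y = 2 x 2 x 2
```

The bacterial primers 27F and 534R as reported contain no ambiguity codes, so each expands to a single sequence.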

The raw sequences were quality checked using FastQC51 and then processed and analysed using QIIME2 v. 2018.252. The demux (https://github.com/qiime2/q2-demux, 10/02/2018) and cutadapt (https://github.com/qiime2/q2-cutadapt, 02/12/2017) plugins were used to demultiplex reads and to remove primer sequences. The obtained reads were denoised, dereplicated, and chimera-filtered using the DADA2 pipeline, and amplicon sequence variants (ASVs) were identified53,54. The reads were subsampled and rarefied to the same number for each sample using the feature-table rarefy plugin55. Taxonomy was assigned to ASVs using a pre-trained naïve Bayes classifier based on the 16S rRNA gene sequences clustered at 99% similarity of the SILVA 132 release56. The dataset is available under accession number PRJNA1063038.
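
Rarefaction to an even sequencing depth can be sketched as random subsampling of reads without replacement. This is a simplified stand-in written for illustration, not the QIIME 2 implementation; the ASV names, counts, depth, and seed are hypothetical.

```python
import random


def rarefy(counts: dict, depth: int, seed: int = 0) -> dict:
    """Subsample one sample's ASV counts to `depth` reads without replacement."""
    # Expand the count table into one entry per read
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    rarefied = {asv: 0 for asv in counts}
    for asv in rng.sample(pool, depth):
        rarefied[asv] += 1
    return rarefied


sample = {"ASV1": 50, "ASV2": 30, "ASV3": 20}
rarefied = rarefy(sample, depth=40)
```

Samples with fewer reads than the chosen depth cannot be rarefied and are rejected here with an error; how such samples are handled in a real analysis is a separate decision.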

Functional profile prediction

The metagenomic potential of bacterial communities in thermal waters and groundwaters was predicted using 16S rRNA sequencing data (ASV representative sequences and abundances) via PICRUSt2 (https://github.com/picrust/picrust2) with default parameters57. Functional gene predictions were obtained as Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) entries. Information on metabolic pathways and Enzyme Commission (EC) numbers involved in the S-, As-, N-, and CH4-related biogeochemical cycles was manually categorized based on the KEGG databases.

Metagenome sequencing and a Genome-Centric analysis

A small aliquot (30 µL) of DNA extracted from the hot pool (T5) was sent to DNASense laboratories (Aalborg, Denmark) for shotgun metagenomics analysis. DNA concentration and quality were evaluated using the Qubit dsDNA HS kit and TapeStation with the Genomic ScreenTape (Agilent Technologies, Milan, Italy), respectively. The sequencing library was prepared using the NEB Next Ultra II DNA library prep kit for Illumina (New England Biolabs, Beverly, MA, USA) following the manufacturer’s protocol. Library concentration was measured in triplicate using the Qubit dsDNA HS kit, and library size was estimated using TapeStation with D1000 HS ScreenTape by following manufacturer’s instructions. The sequencing libraries were pooled in equimolar concentrations and diluted to 4 nM. The sample was paired-end sequenced (2 × 301 bp) on a MiSeq (Illumina, San Diego, CA, USA) using a MiSeq Reagent kit v3 with 600 cycles (Illumina, San Diego, CA, USA), following the standard guidelines for preparing and loading samples on the MiSeq.
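
Diluting a library to 4 nM requires converting the fluorometric concentration (ng/µL) and the mean fragment size (bp) into molarity. The sketch below uses the commonly applied approximation of 660 g/mol per base pair of double-stranded DNA; the function names and the example values are hypothetical and are not taken from the study.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # approximate molar mass of one dsDNA base pair


def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA library concentration (ng/µL) into nanomolar."""
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("invalid concentration or fragment size")
    return conc_ng_per_ul / (AVG_BP_MASS_G_PER_MOL * mean_fragment_bp) * 1e6


def dilution_volumes(stock_nm: float, target_nm: float, final_volume_ul: float):
    """Return (stock volume, diluent volume) in µL, from C1*V1 = C2*V2."""
    if target_nm > stock_nm:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_nm * final_volume_ul / stock_nm
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical library: 3.3 ng/µL with a 500 bp mean fragment size
stock = library_molarity_nm(3.3, 500)
stock_ul, diluent_ul = dilution_volumes(stock, target_nm=4.0, final_volume_ul=20.0)
```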

Raw Illumina reads were filtered for PhiX using Usearch1158 and subsequently trimmed using Cutadapt v. 2.1059. Forward and reverse reads were used to perform de novo assembly in megahit v. 1.2.9 according to the protocol reported in Sereika et al.60,61. Bins were subsequently extracted in mmgenome2 v. 2.1.3 and bins were quality-assessed with CheckM v. 1.1.3 by following the protocols previously described62,63. A classification of bacterial bins was performed with the Genome Taxonomy Database toolkit (GTDB-TK) v. 1.3.064. Genome annotations of bacterial and archaeal genomes were conducted with Prokka v. 1.14.665. The joint reads were also annotated according to the COG database to perform a more detailed analysis of the functional genes66. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession PRJNA746793.

Statistical analysis

The principal component analysis (PCA), based on the correlation matrix, was performed on the chemical composition of thermal waters and groundwaters. Similarity matrices of bacterial community composition and metagenome prediction were calculated by applying the relative abundance-based Bray-Curtis index. A non-metric Multi-Dimensional Scaling (nMDS) ordination plot was used to visualize the variation patterns of the main families (≥ 1% of total reads) and of the predicted KO related to the S-, As-, N-, and CH4-related biogeochemical cycles. The correspondence analysis was performed separately on the relative abundances of the microbial taxa and of the predicted KO observed in thermal waters and groundwater. Since the assumptions of normality and homogeneity of variance were not met for all variables, the non-parametric multivariate analysis of variance (PERMANOVA) and the analysis of similarities (ANOSIM) were used to test the differences in chemical and microbiological composition between thermal waters and groundwater. Statistical analyses were performed using the PAST software package (Palaeontological STatistics, ver. 4.04)67.
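
For reference, the Bray-Curtis dissimilarity between two samples is the sum of absolute abundance differences divided by the sum of all abundances. The sketch below is a plain-Python illustration with hypothetical abundance vectors; the analyses in this study were run in PAST, not with this code.

```python
def bray_curtis(u, v) -> float:
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    # Two empty samples are treated as identical by convention here
    return numerator / denominator if denominator else 0.0


# Hypothetical abundances of three taxa in two samples
d = bray_curtis([10, 0, 5], [5, 5, 5])  # (5 + 5 + 0) / 30
```

The index ranges from 0 (identical composition) to 1 (no shared taxa).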

The ASV table obtained by 16S rRNA gene bioinformatics processing was filtered by retaining only ASVs with ≥ 20 reads in at least one sample. Among those, the ASVs shared between groundwater (GW) and thermal waters (TW) were identified as the core microbiome.
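
The filtering and core-microbiome steps can be sketched as follows. The code assumes that "shared" means detected (count > 0) in at least one TW sample and at least one GW sample, which is one possible reading of the criterion; the table contents and function name are hypothetical.

```python
def core_asvs(table: dict, groups: dict, min_reads: int = 20) -> list:
    """Return ASVs passing the read filter and present in both TW and GW.

    table:  {asv_id: {sample_id: read_count}}
    groups: {sample_id: "TW" or "GW"}
    """
    core = []
    for asv, counts in table.items():
        # Keep ASVs reaching min_reads in at least one sample
        if max(counts.values()) < min_reads:
            continue
        # Assumption: presence means a non-zero count in the sample
        present_in = {groups[s] for s, n in counts.items() if n > 0}
        if {"TW", "GW"} <= present_in:
            core.append(asv)
    return sorted(core)


groups = {"T2": "TW", "T5": "TW", "G1": "GW"}
table = {
    "ASV_a": {"T2": 120, "T5": 0, "G1": 3},   # shared, passes filter
    "ASV_b": {"T2": 15, "T5": 10, "G1": 12},  # shared, fails filter
    "ASV_c": {"T2": 0, "T5": 0, "G1": 200},   # GW only
}
core = core_asvs(table, groups)
```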

Results and discussion

Water chemistry

TW were characterized by high temperature (up to 59.3 °C), high EC (up to 5680 µS/cm), and DO ranging from 0.2 to 1.8 mg/L (Table 1). Among thermal waters, T1 showed lower temperature (12.7 °C) and EC (2040 µS/cm) because it was collected into a pool at the end of a channel coming from a hot spring (T2-T3) where waters cooled down to ambient temperature68. In contrast, GW were characterized by a temperature range between 15.5 and 24.9 °C, an EC range between 250 and 724 µS/cm, and DO on average 4.7 mg/L.

Table 1 Geographical coordinates and physical-chemical characteristics of thermal waters and groundwaters.

As expected according to the typifying chemical composition (Table S1), TW strongly differed from GW (Fig. 1). In particular, TW was distinguished by higher content of SO42- (1153–1980 mg/L), Na (average 35.4 mg/L), Sr (12–15 mg/L), Mg (average 0.15 mg/L), total As (up to 351.9 µg/L) and As(III) (up to 209.2 µg/L), Ca (average 0.63 mg/L), Cs (55.1–67.1 µg/L), and B (up to 1277 µg/L). In contrast, GW showed elevated levels of U (1.2–9.4 µg/L), V (up to 16.0 µg/L), NO3- (up to 20.6 mg/L), Cu (0.2–23.3 µg/L), Zn (up to 594 µg/L), and Cr (average 0.8 µg/L). Total As concentration was in the range of 20.5–182.4 µg/L. The significant differences obtained between TW and GW according to the different chemical characteristics of the studied sites, as highlighted by PCA analysis, were confirmed by PERMANOVA and ANOSIM tests (p < 0.01) (Table S2).

In line with previous reports69,70, the GW was enriched in As, U, V, and NO3- resulting from the interaction with the volcanic rocks of the Vico district. The natural occurrence of these elements in the GW can be explained by the complexity of the hydrostratigraphy, the structural setting of the area, and the related mixing occurring between water circulating in the basal volcanic aquifer and the fluids that rise from depth, all of which characterize the active geothermal system44,45,46. TW showed very high levels of SO42- and As, in line with previous evidence in the same area as well as in geothermal fields in the USA, Argentina, Mexico, Greece, and Japan71,72,73,74,75, where these elements varied between 0.1 and 1.7 g/L and between 0.04 and 3.4 mg/L, respectively.

Fig. 1

Principal Component Analysis (PCA) biplot representing the typifying chemical composition of thermal waters and groundwater. The vector length is proportional to the correlation between the corresponding parameter and PCA axes 1 and 2. The bar plot shows the contribution of each variable (vector projection values) expressed as the correlation with the x-axis. The variables significantly correlated with the x-axis were the main ones responsible for the obtained clustering. 1, U; 2, V; 3, NO3; 4, Cu; 5, Zn; 6, Cr; 7, Pb; 8, Ni; 9, Al; 10, Se; 11, Cl; 12, Li; 13, Sb; 14, Ba; 15, Be; 16, Fe; 17, K; 18, Mn; 19, F; 20, B; 21, Cs; 22, As(III); 23, Ca; 24, AsTot; 25, Mg; 26, Sr; 27, Na; 28, SO4.

These physicochemical differences not only reflect distinct hydrogeological origins but may also indicate mixing zones between thermal and non-thermal waters. Previous studies have shown that mixing of chemically contrasting water bodies (e.g., surface water with groundwater, or contaminated with uncontaminated aquifers) can create sharp redox gradients, transient geochemical conditions, and increased microbial diversity due to overlapping ecological niches76,77,78. Such environments often support both aerobic and anaerobic metabolisms and foster functional interactions (e.g., sulfur and nitrogen cycling). The chemical gradients observed in this study may reflect transitions across groundwater mixing zones and contribute to shaping microbial community composition and functional potential79,80. These observations support the interpretation that environmental heterogeneity driven by mixing processes may act as a key ecological filter, influencing microbial assemblages and driving biogeochemical processes across the aquifer–geothermal interface.

Microbial abundance and taxonomic diversity

Total cell counts were higher in TW (between 0.5 × 10^5 and 6.8 × 10^5 cells/mL) than in GW (between 0.1 × 10^5 and 3.3 × 10^5 cells/mL) (Table S1), in line with previous evidence in the same area75,81. The application of flow cytometry analysis allowed the detection of HNA and LNA cells, which are considered a constitutive trait of microbial communities in a large variety of aquatic ecosystems48. In this study, the percentage of HNA cells reached up to 93.9% and 7.0% in TW and GW, respectively, in line with previous findings82,83,84. Microbial cells with contrasting nucleic acid content represent distinct fractions with functional and ecological relevance. The LNA and HNA cell counts exhibited consistent variations among adjacent aquifer units85. Comparably high percentages of HNA cells, typically larger and with more complex internal structures than LNA cells, were reported in geothermal areas, suggesting an active microbial metabolism in warmer, anaerobic aquifer sections86.

The high-throughput sequencing of the 16S rRNA gene of Archaea and Bacteria allowed the description of the microbial composition in TW and GW (Fig. 2). Overall, bacterial and archaeal members observed in TW and GW were in line with those retrieved in other similar environments19,26. The archaeal community analysed in TW was mainly represented by unidentified members of the phyla Altiarchaeota (16.6–88.9%) and Euryarchaeota (up to 44.0%). A large portion of archaeal reads was not assigned to known microbes, suggesting the occurrence of novel thermophiles, in line with previous evidence in the same area81. The bacterial community of TW was mainly composed of members of the phyla Pseudomonadota (range 24–87% of total reads), Campylobacterota (up to 42.9%), and Bacteroidota (up to 54.6%).

Among the GW, Archaea were detected only in the G2 sample, where they were represented by members of the genera Candidatus Nitrosoarchaeum and Candidatus Nitrosotenuis (31.1 and 35.3% of total reads, respectively), suggesting an active role of microbial communities in the nitrogen cycle in the studied aquifer. No amplification of the V3-V5 region of the archaeal 16S rRNA gene was observed in the other GW samples, suggesting an apparent absence of microorganisms belonging to this domain. The absence of Archaea in these samples, although unexpected considering the numerous reports in GW environments87,88, may reflect site-specific geochemical or ecological conditions that are not favorable to archaeal communities.

The bacterial community in GW was mainly represented by members of the phyla Pseudomonadota (22.3–78%), Nitrospirota (up to 51.8%), Campylobacterota (0–19.4%), and Actinomycetota (up to 14.8%).

Fig. 2

Frequency heat-map of: (a) Bacteria (family ≥ 1% relative abundance of total reads in at least one sample); (b) Archaea. The colour intensity in each cell shows the relative abundance of microbial taxa in thermal waters and groundwaters.

A clear differentiation between microbiome composition in TW and GW was found (Fig. 3). The microbial community in T1 more closely resembled those found in GW than in TW, likely due to the unique characteristics of this sample. Specifically, it was collected from a pool at the end of an approximately 100-meter-long artificial channel, where hot spring water had cooled to ambient temperature68. The microbial community observed in this sample was mainly composed of members of the families Rhodobacteraceae, Sphingomonadaceae, Flavobacteriaceae, Microbacteriaceae, and Paracaedibacteraceae, as also observed in several GW samples. Nevertheless, based on bacterial family-level data, a significant difference between TW and GW (without considering T1) was observed (Fig. 3a), as also confirmed by PERMANOVA and ANOSIM tests (p < 0.05) (Table S2). The comparison between NMDS and correspondence analysis biplots helped to graphically synthesize the similarity/dissimilarity of microbial communities in GW and TW (Fig. 3a) as the result of the mixing occurring between water circulating in the basal volcanic aquifer and the fluids that rise from depth. Indeed, different clusters of samples can be grouped within GW and TW due to their similar typifying bacterial composition (Fig. 3b, c). Specifically, T4, T6, and T7 showed greater similarity among themselves and with G7 than with other TW due to the co-occurrence of the families Pseudanabaenaceae and Lachnospiraceae. Samples T2, T3, and T5 clustered together due to the high percentage of unidentified members of Campylobacterales and Halothiobacillaceae, representing the core microbiome with values ranging between 10.5–24.5% and 72.9–86.7%, respectively. In contrast, samples G1 and G4 were mainly characterized by unidentified members of the class Thermodesulfovibrionia (on average 49.1%) and Sulfurovum (on average 19.2%), which were also abundant in T7, representing 36.2% and 17.7% of total reads, respectively. Lastly, G7 was highly different from the other GW due to the higher occurrence of members affiliated with the families Halothiobacillaceae (22.6%) and Clostridiaceae 1 (11.6%). Most likely, the different microbial community composition observed in G7 was related to its slightly different physical-chemical characteristics (i.e., higher T and EC and lower DO than other GW).

Fig. 3

a. NMDS ordination plot, based on Bray-Curtis distance matrices of log (X + 1)-transformed bacterial sequencing data (community at family level ≥ 1% in at least one sample). The stress value (i.e., < 0.2) indicates an accurate representation of the dissimilarity among samples. b. Correspondence analysis plot based on groundwater bacterial 16S rRNA sequencing data. c. Correspondence analysis plot based on thermal water bacterial 16S rRNA sequencing data. 1, Blastocatellaceae; 2, Holophagaceae; 3, Corynebacteriaceae; 4, Nocardiaceae; 5, Sporichthyaceae; 6, Intrasporangiaceae; 7, Microbacteriaceae; 8, Nocardioidaceae; 9, Propionibacteriaceae; 10, Streptomycetaceae; 11, Cyclobacteriaceae; 12, Crocinitomicaceae; 13, Flavobacteriaceae; 14, Anaerolineaceae; 15, Ardenticatenales un.; 16, Chloroflexaceae; 17, Chloroplast un.; 18, Phormidiaceae; 19, Pseudanabaenaceae; 20, Thermaceae; 21, Sulfurovaceae; 22, Campylobacterales un.; 23, Clostridiaceae 1; 24, Clostridiales Incertae Sedis; 25, Heliobacteriaceae; 26, Lachnospiraceae; 27, Peptostreptococcaceae; 28, Nitrospiraceae; 29, Thermodesulfovibrionia un.; 30, Omnitrophicaeota un.; 31, Candidatus Peribacteria; 32, Parcubacteria un.; 33, Saccharimonadales un.; 34, Caulobacteraceae; 35, Paracaedibacteraceae; 36, Reyranellaceae; 37, A0839; 38, Beijerinckiaceae; 39, Rhizobiaceae; 40, Xanthobacteraceae; 41, Rhizobiales un.; 42, Rhodobacteraceae; 43, Rhodospirillales un.; 44, Sphingomonadaceae; 45, Acidiferrobacteraceae; 46, Acidithiobacillaceae; 47, Burkholderiaceae; 48, Chromobacteriaceae; 49, Gallionellaceae; 50, Hydrogenophilaceae; 51, Rhodocyclaceae; 52, Halothiobacillaceae; 53, Pseudomonadaceae; 54, Xanthomonadaceae; 55, Spirochaetaceae.

According to previous reports, the main taxa herein retrieved could be mainly involved in nitrogen, sulfur, and As biogeochemical cycles, in agreement with the different physical-chemical water composition87,89,90. Furthermore, the interaction between upwelling deep waters and shallow groundwater has resulted not only in the presence of chemical elements that are normally present at only trace levels in non-thermal waters but also in the co-occurrence of several microbial taxa in both TW and GW. These mixing zones likely represent ecological hotspots where microbial communities from distinct habitats can interact, facilitating the exchange of metabolic functions and enhancing biogeochemical processes across the hydrogeological interface. Indeed, a core microbiome composed of 95 ASVs retrieved in both TW and GW was herein discovered (Table S3). Among these, 25 ASVs were affiliated with four families: Burkholderiaceae, Caulobacteraceae, Halothiobacillaceae, and Sulfurovaceae (Fig. 4). Among these families, the abundance of the main identified genera was significantly correlated with the physical-chemical parameters characterizing the study site (Fig. S2). In particular, ASVs belonging to sulfur-oxidizing bacteria of the genus Sulfurovum were highly abundant in both TW (T7 and T8) and GW (G1, G3, and G4). The metabolic potential of microorganisms to oxidize sulfur compounds played a predominant role, as also suggested by the high occurrence of reads affiliated with the family Halothiobacillaceae and the genus Thiofaba, mainly in T2, T3, T5, and G7. Moreover, nitrate-reducing chemolithoautotrophs such as Nitratifractor sp. (family Sulfurovaceae), usually found in hydrothermal systems91, were observed in all TW and in G6 and G7. Furthermore, in T7 and in G1, G3, G4, and G6, Nitrospirota affiliated with uncultured Thermodesulfovibrionia were observed, suggesting the occurrence of a complete respiratory pathway to reduce oxygen and nitrate via dissimilatory nitrate reduction to ammonium20,92. The involvement of microbial communities in nitrate reduction and denitrification was also supported by the occurrence, among the shared ASVs, of reads affiliated with the genera Acidovorax, Brevundimonas, and Rhodoferax93,94,95 in TW and GW. Members of the four families occurring in the shared core microbiome were also found to resist and metabolize As. For example, one ASV classified as Herminiimonas sp. (family Burkholderiaceae) was herein observed, representing up to 36.6% of total reads in T4 and 10% in G7. The taxonomic assignment of this ASV was additionally verified with the BLASTn algorithm96, and a 100% similarity match was obtained with Herminiimonas arsenicoxydans strain ULPAs1. This bacterium was isolated from industrial sludge heavily contaminated with As and was shown to efficiently oxidize As(III) to the less toxic As(V)97,98.

Fig. 4

Box plot showing the relative abundance (% of total reads) of the most abundant bacterial families occurring in total reads from TW and GW.

Functional prediction based on bacterial 16S rRNA sequencing data

According to the peculiar chemical and biological characteristics of analyzed waters, the data elaboration focused on the enzymes involved in S-, As-, N-, and CH4-related biogeochemical cycles. No significant differences (p > 0.05) were observed between TW and GW predicted metagenomes, suggesting that, despite differences in microbial community composition, similar metabolic activities can occur in the Cimino-Vico area (Fig. 5; Table S2; Table S4). Indeed, as previously mentioned, the microbial taxa shared among samples were described in literature for their potential role in S-, As-, and N-cycling. The only exception was T6, which showed a potential involvement in dissimilatory nitrate reduction and denitrification, mainly attributed to the high occurrence of Rhodoferax genus. In all other samples, the S-related predicted metabolic pathways were mostly related to sulfur oxidation, mainly represented by reactions HS- → SO42- and HS- → S0, and anaerobic sulfite reduction represented only by reaction SO3- → HS-. Methanogenesis was mainly acetoclastic, represented by reaction Acetate → CH4, while the predicted key genes for methane oxidation were observed, albeit to a lesser extent, only in G1 and G2 samples. The predicted occurrence of key genes involved in methanogenesis did not imply active metabolism in methane production, but rather reflected the possible presence of a few microorganisms carrying such genes under suboptimal conditions. The capability of microbial communities to resist the high As content naturally occurring in the studied geothermal area was supported by the predicted abundances of As-related KO (Fig. 5, Table S4). Moreover, the potentialities in As(V) reduction and As(III) oxidation were predicted in all water samples.

Fig. 5

a. NMDS ordination plot, based on Bray-Curtis distance matrices of log (X + 1)-transformed typifying relative abundance of predicted KO related to the S-, As-, N-, and CH4-related biogeochemical cycles in thermal waters and groundwaters. The stress value (i.e., < 0.2) indicates an accurate representation of the dissimilarity among samples. b. Correspondence analysis plot based on groundwater data. c. Correspondence analysis plot based on thermal water data. a, S-related compounds oxidation; b, As(V) reduction; c, As transcription factors; d, S-related compounds reduction; e, As(III) oxidation; f, As resistance; g, As transporters; h, nitrogen fixation; i, methanogenesis; j, denitrification; k, nitrification; l, methane oxidation; m, assimilatory nitrate reduction; n, dissimilatory nitrate reduction; o, complete nitrification (comammox).

Shotgun metagenomic sequencing of thermal water

The integration of physical-chemical analyses and amplicon sequencing data provided insights into the characteristics of TW and GW systems in the Cimino-Vico volcanic area, highlighting the interaction between upwelling deep waters and shallow groundwater. A shotgun metagenomics analysis was performed on the T5 sample to better clarify the metabolic potential of microorganisms in this geothermal area. This sample was chosen due to the low level of microbial identification obtained by 16S rRNA gene sequencing. Specifically, the archaeal reads could not be assigned to known taxa according to the SILVA v.132 database, while the bacteria were mainly identified as members of the family Halothiobacillaceae, with no classification obtained at a deeper taxonomic level. A total of 7 genome bins were extracted based on differential genome abundance and using a kmer-based tSNE approach (t-distributed stochastic neighbour embedding). Overall, all genome bins were highly complete (86.8–100%) and contained very low levels of contamination (0–4.5%), as assessed from the presence of CheckM marker genes. Among all genome bins (Table 2; Fig. S1-S2), the most abundant was affiliated with Halothiobacillaceae, in line with the 16S rRNA gene sequencing. The number of predicted coding sequences (CDS) ranged from 1238 to 3395. For each bin, a high portion of CDS was classified in COG functional categories (from 32.1 to 49.6%) and around 800 CDS were annotated as hypothetical proteins.

Table 2 Sequencing and assembly statistics and PROKKA annotation results. Contigs, number of contigs; genome N50, the length of the shortest contig among the longest contigs that together cover 50% of the genome; genome size, the total length of each bin; GC, the content (%) of guanine-cytosine (GC) nucleotides; total coding sequences (CDS), number of predicted CDS; matching to COGs (Clusters of Orthologous Groups), number of CDS in COG classification; with Enzyme Commission Number, number of CDS assigned an Enzyme Commission number; missing CDS, number of CDS not classified in COG.
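
The N50 statistic defined in the caption can be sketched as follows. This is a minimal illustration with hypothetical contig lengths, not the values reported in Table 2, and the function name is an assumption.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp).

    Contigs are added from longest to shortest; the N50 is the length of the
    contig at which the running total first reaches half of the assembly size.
    """
    if not contig_lengths:
        raise ValueError("at least one contig is required")
    half_total = sum(contig_lengths) / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length

# Hypothetical bin with five contigs (lengths in bp).
example_contigs = [10_000, 8_000, 5_000, 3_000, 2_000]
print(n50(example_contigs))  # 8000, since 10,000 + 8,000 >= 14,000 (half of 28,000)
```

A higher N50 for a similar genome size indicates that the bin is assembled into fewer, longer contigs.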

The major functional categories were similar among the extracted bins. Translation, amino acid transport and metabolism, coenzyme metabolism, energy production and conversion, and inorganic ion transport and metabolism were the most abundant functions identified.

A manually curated analysis of CDS mainly annotated in the “Inorganic ion transport and metabolism” category was performed to explore the presence of genes involved in S-, As-, N-, and CH4-related biogeochemical cycles (Fig. 6). Dissimilatory sulfur metabolism comprises energy-conserving microbial pathways that mediate the transformation of sulfur compounds and contribute significantly to the biogeochemical sulfur cycle in anaerobic environments. It encompasses three main processes99: sulfur reduction, oxidation, and disproportionation100,101. In line with the potential metagenome obtained by 16S rRNA gene-based functional predictions, the S-related potential in the T5 sample was mainly related to sulfur oxidation and anaerobic sulfite reduction. In particular, BIN2, BIN3, and BIN4 together have the potential to oxidize sulfur through the reactions HS- → SO42- and HS- → S0 (sox system), while no such potential was observed in BIN6 and BIN7. Moreover, the potential to anaerobically reduce sulfite (asr enzymes) was observed in all bins except BIN1, with BIN2 carrying all the required functional genes (i.e., asrA coding for ferredoxin, asrB coding for NAD(P)H-flavin reductase, asrC and dsvB coding for dissimilatory sulfite reductase). Lastly, as evidenced by both shotgun metagenomics and 16S rRNA-based functional prediction, the inorganic fermentation of sulfur, sulfite, and thiosulfate into sulfide and sulfate, known as disproportionation102,103, seemed to be irrelevant in our system due to the absence of dsr, hdr, and psr genes. However, the absence of these genes does not definitively exclude the occurrence of sulfur disproportionation via alternative pathways or microbes not detected by this predictive analysis.
In the T5 sample, assimilatory sulfur metabolism seemed to play a minimal role in the biogeochemical sulfur cycle, because the cys enzyme, central to this reaction and to the subsequent generation of a variety of downstream metabolites, was present only in BIN1. This evidence confirmed the key role of Halothiobacillaceae in this geothermal area. A wide range of microbial taxa transform both organic (e.g., cysteine/Cys, methionine/Met and dimethyl sulfide/DMS) and inorganic (e.g., SO42−, SO32− and H2S) sulfur compounds, playing key roles in the biogeochemical sulfur cycle and exhibiting broad ecological distribution104,105,106,107,108. Among these, members of the Halothiobacillaceae family can play an important role in global carbon and sulfur cycles due to their dependency on inorganic compounds for carbon and energy needs109,110. These taxa and their sulfur-related activities have often been reported in microbial mats, hydrothermal vents, and hot springs such as those in Yellowstone National Park111,112,113.

Fig. 6

Detailed information and presence of functional enzymes involved in As-, S-, CH4-, and N-related transformations in each bin. Red, present; white, absent.

Among the other biogeochemical cycles expected in these waters, denitrification potential was observed in almost all bins, while nitrogen fixation was observed in two of the seven bins and complete nitrification (comammox) only in BIN2. Overall, nitrogen fixation is widespread across environments and is often associated with carbon fixation in microbial mat communities114. Furthermore, several studies revealed a strict interplay between sulfur and nitrogen cycling in aquatic environments, highlighting high complexity and synergistic interactions115,116. For example, the “cryptic sulfur cycle” is closely connected to the nitrogen cycle, especially within oxygen-minimum zones (OMZs), where both sulfate and nitrate reduction frequently occur. Defined by the rapid coupling of sulfate reduction and sulfide oxidation without the accumulation of hydrogen sulfide, the cryptic sulfur cycle can influence anammox and other nitrogen cycling processes104,117,118,119. In addition, the microbiome revealed in this sample showed a potential metabolic involvement in acetoclastic methanogenesis. Indeed, the acetyl-coenzyme A synthetase (acs), responsible for the conversion of acetate to acetyl-CoA, was reported in BIN1, BIN2, BIN4, and BIN5. The acetate kinase (ackA) was observed in BIN5, BIN6, and BIN7. Lastly, the phosphotransacetylase (pta) was revealed in BIN1, BIN6, and BIN7.
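
The per-bin gene occurrences listed above can be cross-tabulated with simple set operations. The sketch below is illustrative only: the bin assignments are copied from the text, while the grouping into two acetate-activation routes (Acs alone, or AckA together with Pta via acetyl phosphate) reflects general biochemistry rather than an analysis reported by the authors, and the helper name is hypothetical. Gene presence indicates potential only, not activity.

```python
# Gene occurrences as reported in the text for the acetate-activating enzymes.
gene_to_bins = {
    "acs": {"BIN1", "BIN2", "BIN4", "BIN5"},
    "ackA": {"BIN5", "BIN6", "BIN7"},
    "pta": {"BIN1", "BIN6", "BIN7"},
}
all_bins = {f"BIN{i}" for i in range(1, 8)}

def bins_with_all(genes):
    """Return the bins that carry every gene in `genes`."""
    return set.intersection(*(gene_to_bins[g] for g in genes))

# Two routes from acetate to acetyl-CoA: Acs alone, or AckA together with Pta.
acs_route = bins_with_all(["acs"])
ack_pta_route = bins_with_all(["ackA", "pta"])
no_route_genes = all_bins - set.union(*gene_to_bins.values())

print(sorted(acs_route))       # ['BIN1', 'BIN2', 'BIN4', 'BIN5']
print(sorted(ack_pta_route))   # ['BIN6', 'BIN7']
print(sorted(no_route_genes))  # ['BIN3']
```

Under these assumptions, every bin except BIN3 carries at least one of the three genes, and only BIN6 and BIN7 carry both ackA and pta.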

In line with the 16S rRNA gene sequencing results and the high As content in the studied area, the shotgun metagenomic analysis showed a strong involvement of the microbiome in the As cycle. In particular, the potential for As uptake was observed in all bins (Fig. 6). No specific As uptake system exists, so prokaryotic cells can use unspecific and specific phosphate transporters, such as Pit (Phosphate inorganic transport) and Pst (Phosphate specific transport), respectively120, or aquaglyceroporins like the glycerol facilitator GlpF121. Furthermore, the occurrence in all bins of ars and acr enzymes, involved in cytoplasmic As(V) reduction, confirms the capability of the microbiome to resist the high As concentration122. Microbes can reduce As(V) to As(III) inside the cell by means of a cytoplasmic As(V) reductase (arsC) as part of a detoxification process123,124. After As(V) reduction, As(III) can be extruded from the cells via membrane efflux pumps encoded by the arsRBC or acr3 operons125,126. The functional enzymes involved in aerobic As(III) oxidation (aioB) were retrieved only in BIN3, identified as genus UBA6016 within the order Campylobacterales, suggesting an active role of Campylobacterota in As cycling. The potential for As(III) methylation was present in BIN1, BIN2, and BIN6, while no potential for respiratory As(V) reduction, anaerobic As(III) oxidation, or As demethylation was observed. This evidence agreed with that obtained from the 16S rRNA gene-based functional prediction and with previous reports in the same area39,40. The occurrence of As in groundwater was determined by geothermal inputs and geogenic processes71,127,128. The thermal waters analysed here presented a high As concentration of geogenic origin44,45,46, with a high content of As(III), as often observed in geothermal systems129.
In line with previous studies75,81, the microbial communities in this area exhibited a high metabolic potential for As tolerance and its direct transformation, thus revealing high potential for biotechnological application in the treatment of As-rich natural waters47,75,81.

Conclusion

The occurrence of novel thermophiles able to withstand extreme physical-chemical conditions and high concentrations of toxic elements (i.e., As) was observed in the geothermal waters of the Cimino-Vico volcanic area. The presence of As and sulfur compounds at high levels in groundwater was due to the mixing between the water circulating in the basal volcanic aquifer and the fluids that rise from depth. The interaction between the two water compartments also affected the structure and functioning of groundwater microbial communities. Despite an overall different community composition in TW and GW, a core microbiome composed of the families Burkholderiaceae, Caulobacteraceae, Halothiobacillaceae, and Sulfurovaceae was identified as an indicator of the interaction between the two water compartments. The combined application of amplicon sequencing and shotgun metagenomics shed light on the involvement of microbial communities in S-, As-, and N-related biogeochemical cycles. Our findings demonstrate the importance of considering microbial interactions and their biogeochemical impacts when assessing the risks and benefits associated with the utilization of geothermal resources.