Introduction

The energy crisis is closely linked to climate change, with carbon emissions from fossil fuels exacerbating the climate crisis1,2,3. The overconsumption of fossil fuels not only amplifies environmental pollution and climate change, but also exacerbates energy supply risks, making the transition to clean and sustainable energy sources increasingly urgent4,5,6. Utilization of clean and green energy can mitigate climate change and improve energy efficiency, contributing to sustainable development7,8. In this context, Carbon Capture, Utilization, and Storage (CCUS), geothermal energy, and hydrogen geological storage have emerged as effective energy transition options to reduce carbon emissions and slow down climate change9,10. CCUS technology captures and stores carbon dioxide underground, reducing greenhouse gas emissions11. Geothermal energy, a sustainable energy source, involves extracting thermal energy from the Earth’s interior12 and has the potential to effectively mitigate climate change13,14. Hydrogen serves as a clean and reliable energy carrier that can reduce CO2 emissions and mitigate climate change15,16. Therefore, the adoption of green energy will promote sustainable development and a healthier future, guiding us toward a cleaner, low-carbon world17.

However, subsurface applications like CCUS, geothermal utilization, and hydrogen geological storage face challenges related to technology maturity, applicability, economic feasibility, and, more importantly, life cycle monitoring of the deep subsurface. The main challenge for CCUS is the cost and effectiveness of monitoring the flow path of CO2 to prevent CO2 leaks18,19. Various monitoring techniques can help mitigate these risks20, such as seismic (high-cost, high-benefit), electromagnetic (EM) (high-cost, low-benefit), and controlled source electromagnetic methods (limited resolution and difficulty in accurately interpreting deeper subsurface features)21, the latter being underused22. Geothermal energy also encounters challenges in monitoring groundwater recharge and circulation23. Monitoring the chemical properties of geothermal fluids is crucial for understanding fluid-rock interactions, predicting potential scaling or corrosion issues, and assessing the reservoir’s response to production and reinjection24. For hydrogen storage and generation, which is critical for decarbonizing hard-to-electrify sectors, a detailed caprock assessment is crucial to prevent H2 leakage, and diverse monitoring techniques and sealing mechanisms are needed for safe storage25,26. Evaluating rock hydrogen capacity, selecting suitable formations, and implementing control measures are critical steps27. Additionally, large-scale and long-term H2 storage remains largely untested28. Surveillance of hydrogen storage and generation sites can employ geophysical, geochemical, and microbiological techniques to characterize reservoirs and formation fluids, including well logging and distributed acoustic sensing, but these methods suffer from instability and uncertainty in geological description and division29,30. Therefore, life cycle monitoring of deep subsurface fluid flow paths is essential for tracking gas leakage, fluid flow dynamics, and well performance.
Specifically, life cycle monitoring refers to the continuous tracking of subsurface fluid flow paths throughout the entire circulating water injection process in the reservoir, encompassing the initial injection, ongoing circulation, post-injection monitoring, and production stages. This approach can improve the management of injection rates, pressure, and reservoir stability in CCUS, enhance resource monitoring and environmental impact mitigation in geothermal energy, and help ensure the safe and efficient storage of hydrogen. Consequently, developing a time-resolved, cost-effective method to assess deep subsurface fluid flow paths is paramount to overcoming these challenges and advancing these subsurface applications.

The deep subsurface (the lower boundary of the subsurface biosphere at depths beyond 1 km) hosts unique anaerobic, high-temperature (exceeding 80 °C), and high-pressure (55.8 MPa) conditions, fostering a distinct microbial ecosystem31,32,33,34. This microbial ecosystem exhibits differences in deoxyribonucleic acid (DNA) at various depths, reflecting varied subsurface characteristics35. Over the past two decades, DNA sequencing technologies have evolved rapidly36, becoming more cost-effective37. In 2007, the cost of sequencing a complete human genome was about $1 million, but by 2023, this cost had dropped to about $100 per genome38,39. These advancements have enabled potential applications of DNA sequencing in CCUS, geothermal utilization, and H2 geological storage. In CCUS reservoirs, DNA sequencing helps assess CO2 metabolic capacities, monitor subsurface environmental changes, ensure CO2 storage capacity, and detect potential leaks40. It also evaluates the microbial response to the injected CO2, assessing the impact of storage on underground ecosystems41. For geothermal energy, DNA sequencing provides valuable information on how storage and energy extraction processes affect microbial communities. In the context of hydrogen storage and generation, predominant microorganisms responsible for biohydrogen production and storage include Clostridium, Enterobacter, Bacillus, Escherichia coli, thermophilic lactic acid bacteria, and Klebsiella42. Integrating DNA sequencing with CCUS, geothermal, and H2 geological storage is therefore practical and beneficial. Moreover, combining DNA sequencing with the assessment of deep subsurface fluids flow pathways can help address challenges related to monitoring, uncertainty in geological description and division, and high costs, providing a comprehensive approach to enhance the efficiency and safety of these subsurface applications.
Table 1 outlines the definitions, advantages, and disadvantages of three traditional methods for tracking fluids flow pathways: production dynamics analysis (such as production rates, pressure variations, and inter-well connectivity), well tests, and tracer tests. These traditional methods often encounter challenges such as complexity, subjectivity, high costs, formation pollution, and monitoring limitations. Future work should aim to develop cost-effective, environmentally friendly, user-friendly, and long-term monitoring approaches. Integrating tracer and microbiology technologies holds promise for providing a comprehensive solution to these challenges.

Table 1 Advantages and Disadvantages of various methods for assessing fluids flow pathways

Recently, there have been rapid advancements in using microbial DNA for monitoring subsurface fluids flow paths, especially in inter-well communication within reservoirs. DNA-based methods for this analysis can be divided into two main types40. The first method involves injecting DNA wrapped in nanomaterials into subsurface formations and measuring the concentration of the injected DNA in the produced fluids43,44,45. However, this approach can encounter several challenges, including unstable DNA injection, susceptibility to underground microbial influences, and low survival rates of injected DNA46. The second method involves sampling and sequencing DNA from produced fluids at both injection and production wells, then comparing the DNA profiles to assess fluids flow pathways, as shown in Fig. 1. Cyclic water injection is a technique that involves treating produced fluids and reinjecting them into injection wells, dynamically adjusting injection and production parameters to optimize reservoir pressure distribution, mobilize remaining oil, and delay reservoir depletion in high-water-cut fields47. Passive tracer tests are indeed a well-established method for determining inter-well communication and are highly reliable for estimating reservoir volume and connectivity. However, DNA-based analysis offers complementary advantages that conventional tracers do not. Specifically, microbial DNA tracers can provide dynamic, time-resolved insights into subsurface biological activity and fluid pathways that passive tracers cannot capture. This method could enhance reservoir characterization by identifying microbial responses to subsurface conditions, enabling a more detailed understanding of fluid movement and long-term reservoir behavior. For example, Song et al.48 demonstrated the spatiotemporal migration of underground microbial communities through long-term sampling over 5 years. 
Sawadogo et al.49 evaluated inter-well communication by comparing DNA sequencing results from 12 production wells and 3 injection wells at three time points, finding that communication between wells is influenced by the completion sequence. Kobayashi et al.50 conducted DNA sequencing on samples from 9 production wells and 3 injection wells at five time points over 2 years, though they did not examine the impact of front propagation velocity. Lascelles et al.51 analyzed samples from 2 injection wells and 6 production wells to characterize inter-well communication. Zhang et al.52 observed diverse microbial communities in different fractures within the same formation. Additionally, Ursell et al.53 provided insights for optimizing well layout and spacing by analyzing DNA from fluid production across multiple wells over 189 days.

Fig. 1: Schematic diagram of inter-well communication based on microbial DNA information.

Utilizing microbial DNA data from produced fluids enhances the accuracy of communication assessments, simplifies operations, and improves long-term monitoring effectiveness compared to traditional methods54. Nonetheless, key areas for improvement include: (1) life cycle monitoring of the whole oilfield block; (2) the identification of microbial tracers for precise communication analysis, where reservoir heterogeneity and cyclic reinjection models are expected to strongly influence fluid flow; and (3) incorporation of front propagation velocity as well as temporal analysis of samples from injection and production wells at various time periods. This study introduces a method for life cycle monitoring of deep subsurface fluids under circulating water injection using microbial DNA. The term ‘life cycle’ here refers to the entire process of water circulation above and below ground, which includes the injection of water into the wells, its underground migration to production wells, sampling from these production wells, and the subsequent re-injection into the injection wells to continue the circulation. The approach involves sampling, sequencing, and analyzing microbial composition in sampled fluids over time. By comparing microbial differences in injection and production fluid samples, the method enables dynamic monitoring of fluids flow paths in the deep subsurface. Finally, physical simulation experiments and the Kalman Filter (KF) method are used to validate the accuracy of microbial DNA tracing. This study aligns with the global carbon-neutral transition and with nature-based solutions across the life cycle55. This research reports large-scale temporal well sampling in oil fields in China, creating a comprehensive microbial map with theoretical and practical implications.

Materials and methods

The main methods are divided into three steps: sampling and sample pretreatment, DNA extraction and sequencing, and fluids flow pathways analysis. Initially, fluid samples undergo pretreatment, which includes freezing storage and filtration centrifugation. Next, DNA is extracted, followed by Polymerase Chain Reaction and DNA sequencing. Here we define fluid flow pathways as the dynamic movement of fluids and associated microbial communities between injection and production wells. These pathways can be characterized by analyzing the temporal and spatial similarities in microbial DNA profiles, which reflect the connectivity and transport mechanisms within the reservoir. The analysis focuses on variances in microbial community composition between injection and production fluids samples over time, allowing for the evaluation and validation of fluids flow pathways within the reservoir. Each step is detailed in the following sections. The overall methodology is illustrated in Fig. 2. This approach incorporates the rich diversity of microbial species, thereby expanding the range of parameters considered. It also provides a clearer understanding of subsurface fluid dynamics, which is essential for geoenergy sustainability.

Fig. 2: Flowchart of the main methodology.

The workflow comprises sampling and sample pretreatment, DNA extraction and sequencing, and fluids flow pathways monitoring.

Study site overview and sampling wells

This study conducts extensive site monitoring of the X oilfield in China. The oilfield has been in development for 54 years and currently faces challenges such as high water cut, high degree of oil recovery, and high remaining oil production rate. Produced fluids samples are extracted from the oilfield at a depth of 1180–1300 m. The oil-bearing area covers 3.6 km², with an effective thickness of 14.3 m and reserves of 6.975 MMbbl. Core porosity ranges from 30.0% to 35.6%, and permeability ranges from 544 to 1830 mD. The formation pressure is relatively high, averaging 14.62 MPa. Over 2 months, samples are collected four times from both injection and production wells. Sampling covers ~90% of the wells at this study site. Figure 3 displays the well map: wells marked with red dots are the sampled production wells, and wells marked in blue are injection wells. The sampling times and the number of injection and production wells sampled are provided in Supplementary Table 1, while detailed sampling operations are provided in Supplementary Table 2 and flow rates of production and injection wells are provided in Supplementary Table 3.

Fig. 3: Well map of the study site.

Sampling procedure and sample pretreatment

Fluid samples from production wells are collected directly from the wells, while fluid samples from injection wells are collected from the well valve at the mixing station. Samples are collected in sterile conical flasks, each holding a volume of 30–50 ml. All flasks are thoroughly cleaned and disinfected with UV light before sampling. Sterilized gloves are worn during sampling to prevent potential contamination. Flasks with fluid samples are then submerged in cryopreservation solution with ice packs to maintain a frozen state in the field and, once a day’s worth of samples has been taken, transported to the laboratory. After collection, the samples are frozen as soon as possible to minimize the possibility of gas exchange. During transportation and storage, the samples are also sealed to prevent the ingress of external gases.

Upon arrival at the laboratory, samples are stored in a freezer at −70 to −80 °C for no more than 1 week until extraction. For thawing, the samples are carefully removed from the freezer to minimize exposure to room temperature and placed in an ice bath, fully submerged in ice water. Slow thawing reduces temperature gradients and prevents internal structural damage. The samples are checked every 10–15 min to ensure even thawing and, once fully thawed, removed from the ice bath. Samples then undergo an enrichment centrifugation pretreatment process. The bacterial solution is filtered using 0.45 µm and 0.22 µm filtration membranes to concentrate the bacterial populations onto the filter for subsequent analysis. Microbial residues of the fluid samples are retained on the filter membrane, and the filtration process is carried out on a sterile workbench. Ethanol-disinfected scissors and tweezers are used to carefully cut the filter membrane into small strips, which are then transferred into a 50 ml centrifuge tube containing 10 ml of sterile phosphate-buffered saline. The mixture is centrifuged at 14,000 rpm for 5 min. The resulting bacterial precipitate is stored at −70 to −80 °C until DNA extraction, with storage not exceeding 1 week.

DNA extraction and sequencing

DNA from the enriched and centrifuged samples is extracted and sequenced to analyze microbial community composition of each sample. The DNA extraction step involves adding the produced fluid sample to a buffer solution (1 M Tris-HCl (pH 8.0), 0.5 M EDTA (pH 8.0), NaCl, ddH₂O) for cell fragmentation. DNA is then extracted from the cell solution using chemical reagents such as proteases and salts. Subsequently, DNA is precipitated by adding alcohol or other precipitants, and the precipitate is washed to remove residual salts and impurities.

In this study, the V6 region (a variable region) of the 16S rRNA gene is selected for 16S rRNA sequencing, using primers U789F (forward primer, sequence TAGATACCBGGTAGC) and U1068R (reverse primer, sequence CTGACGRCRRCCATGC) to amplify the target region for microbial diversity and classification analysis. After sequencing is completed, the raw sequencing data are processed to analyze the microbial community composition.
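Both primers contain IUPAC degenerate bases (B and R), so each primer stands for a small pool of concrete oligonucleotides. As a minimal illustrative sketch (the function name `expand_primer` is hypothetical and is not part of the study's pipeline), the pool can be enumerated as follows:

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


forward = expand_primer("TAGATACCBGGTAGC")   # U789F: one B position
reverse = expand_primer("CTGACGRCRRCCATGC")  # U1068R: three R positions

print(len(forward), len(reverse))
```

The forward primer expands to three sequences (B stands for C, G, or T) and the reverse primer to eight (each R stands for A or G).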

Assessment of deep subsurface fluids flow paths

Analysis criteria for injection well samples

This study investigates the composition and temporal changes of microbial communities in injection well fluid samples. The analysis requires consideration of several factors, including temporal shifts and dynamic changes in microbial communities, and environmental influences. If the microbial community composition shows a relatively stable trend over time, it suggests that the collected fluid samples have similar microbial characteristics. In such cases, samples from the mixing stations can represent all injection well fluid samples. This representative sampling allows for a more comprehensive understanding of the characteristics and variation patterns of the entire injection well network.

To assess the impact of a 44-day period on microbial communities, we analyze dynamic changes in microbial communities in injection wells at various times. Fluids from the mixing stations of injection wells are collected at four different times and analyzed using high-throughput sequencing technology to determine the microbial community composition.
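One simple way to make "relatively stable" operational is the coefficient of variation (CV) of each genus across the four sampling times. The sketch below is only an illustration under stated assumptions: the abundance values, the 0.3 threshold, and the helper name are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0 if the mean is 0)."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0


# Hypothetical relative abundances (%) at sampling times T1..T4.
abundance = {
    "Methanomethylovorans": [30.0, 32.0, 29.0, 31.0],
    "Klebsiella": [1.0, 25.0, 1.5, 0.5],
}

CV_THRESHOLD = 0.3  # assumed cut-off for calling a genus "stable"

stable = {
    genus: coefficient_of_variation(series) <= CV_THRESHOLD
    for genus, series in abundance.items()
}
print(stable)
```

A genus whose abundance barely moves between sampling times gets a low CV, while a one-off spike produces a high CV and is flagged as unstable.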

Screening of bacterial tracers based on microbial community composition

To address the issue of overlapping DNA information from oil-water microbial communities in temporal and spatial dimensions due to the cyclic reinjection of water in the field, the solution involves initially screening and identifying tracer strains. Subsequently, tracking these bacterial tracers provides a foundation for analyzing fluids flow pathways. Microbial community composition analysis is conducted on the collected fluids samples from injection and production wells at four different times to analyze their differences and similarities. The similarity and differences between the fluid samples are analyzed using UPGMA (unweighted pair group method with arithmetic mean) clustering tree.
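The UPGMA step can be sketched in a few lines. This is a minimal pure-Python illustration, assuming Euclidean distance between genus-abundance profiles (the text does not state which distance feeds the tree); the sample names and abundance values are hypothetical.

```python
from itertools import combinations
from math import dist


def upgma(profiles):
    """Average-linkage (UPGMA) clustering; returns the tree as nested tuples."""
    # Each cluster is (subtree, list of member sample names).
    clusters = [(name, [name]) for name in profiles]

    def cluster_distance(a, b):
        # UPGMA: mean of all pairwise distances between members of a and b.
        pairs = [(x, y) for x in a[1] for y in b[1]]
        return sum(dist(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(((a[0], b[0]), a[1] + b[1]))
    return clusters[0][0]


# Hypothetical genus-abundance profiles (%), one vector per fluid sample.
profiles = {
    "INJ_T1": [30.0, 10.0, 5.0],
    "INJ_T2": [31.0, 9.0, 5.0],
    "PROD_A": [5.0, 2.0, 40.0],
    "PROD_B": [6.0, 3.0, 38.0],
}

tree = upgma(profiles)
print(tree)
```

In this toy data set the two injection samples merge first and the two production samples form a separate branch, which is the kind of pattern the clustering tree is read for.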

Analyze variations of microbial communities in samples from injection and production wells to understand microbial diversity characteristics. Identify shared microbial communities in injected and extracted fluids and explore the factors behind these differences and similarities. The similarities lie in the shared genera of microbes present in both injection wells and production wells, which can serve as bacterial tracers. However, it is crucial to note the differences, particularly in terms of microbial relative abundance between injection and production wells. Only by thoroughly investigating these similarities and differences can we effectively screen for microbial tracers.

In China’s high water cut oil fields, the cyclic reinjection and multi-well homologous model leads to a shared microbial community between injection and production wells. Microbial tracers selected for this study must remain stable throughout the reinjection cycle above and under the ground. Representative stable endogenous bacterial tracers should meet two criteria: (1) Stable in both injection and production well fluid samples, and (2) Higher abundance in injection well fluid samples compared to production well fluid samples. Selecting bacterial tracers based on these criteria provides a foundation for understanding oilfield microbial community dynamics and evaluating well communication.
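The two screening criteria can be expressed as a simple filter. The sketch below is an assumption-laden illustration: it reads "stable" as "detected at every sampling time", compares mean abundances for the second criterion, and uses hypothetical genus abundances; the study's actual thresholds are not given in the text.

```python
from statistics import mean


def screen_tracers(injection, production):
    """Return genera meeting both screening criteria, sorted by name."""
    tracers = []
    for genus in sorted(set(injection) & set(production)):
        inj, prod = injection[genus], production[genus]
        # Criterion 1 (assumed reading of "stable"): detected at every sampling time.
        always_present = all(v > 0 for v in inj) and all(v > 0 for v in prod)
        # Criterion 2: higher mean abundance in injection wells than in production wells.
        enriched_in_injection = mean(inj) > mean(prod)
        if always_present and enriched_in_injection:
            tracers.append(genus)
    return tracers


# Hypothetical mean relative abundances (%) at sampling times T1..T4.
injection = {
    "Methanococcus": [8.0, 9.0, 12.0, 13.0],
    "Pseudomonas": [1.0, 1.0, 2.0, 1.0],
    "Geobacter": [5.0, 0.0, 6.0, 5.0],
}
production = {
    "Methanococcus": [2.0, 3.0, 2.0, 4.0],
    "Pseudomonas": [10.0, 12.0, 9.0, 11.0],
    "Geobacter": [1.0, 1.0, 1.0, 1.0],
}

tracers = screen_tracers(injection, production)
print(tracers)
```

Here only the first genus passes: the second is more abundant in production wells, and the third is missing from one injection sample.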

To evaluate the front propagation velocity, it is essential to analyze the concentration curve of bacterial tracers in production wells over time. This curve represents the spreading of tracers during injection and helps determine the optimal sampling time. The bacterial tracers present in the sampled fluids at different times are employed for assessing fluids flow pathways.

Assessment of fluids flow pathways

We use principal coordinate analysis (PCoA) clustering to categorize bacterial tracers within each fluid sample, gauging similarity levels. By calculating Euclidean distance based on microbial species abundance, we construct a similarity matrix for PCoA, unveiling microbial relationships in a lower-dimensional space. The closer a point is to the injection well on the figure, the stronger the communication with the production well. This method allows us to visualize and quantify the communication between wells, with closer points indicating enhanced well connectivity.
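The distance step that feeds the PCoA can be sketched as follows. The well names and tracer abundances are hypothetical; the sketch only shows how Euclidean distance to the injection-well profile ranks production wells by apparent communication strength.

```python
from math import dist

# Hypothetical abundances (%) of three bacterial tracers in each fluid sample.
injection_profile = [12.0, 8.0, 5.0]
production_profiles = {
    "P1": [11.0, 7.0, 5.0],
    "P2": [3.0, 1.0, 0.5],
    "P3": [8.0, 6.0, 3.0],
}

# Smaller distance to the injection profile is read as stronger communication.
distances = {
    well: dist(injection_profile, profile)
    for well, profile in production_profiles.items()
}
ranked = sorted(distances, key=distances.get)
print(ranked)
```

With these illustrative numbers, P1 lies closest to the injection well, followed by P3 and then P2.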

Initially, a high-dimensional similarity matrix S is computed to quantify how closely the various genera resemble one another, which provides the basis for a visual, interpretable graphical representation. It is converted into a distance matrix D by D = 1 − S, so that larger values indicate greater dissimilarity. In the next step of the algorithm, the ijth element of D is transformed as described below56:

$${A}_{{ij}}=-\frac{1}{2}{{D}_{{ij}}}^{2}$$
(1)

This yields a matrix \(A\) with the same dimensions as \(D\), whose \(ij\)th element is computed according to Eq. (1). A second transformation is then applied to the elements of \(A\):

$${E}_{ij}={A}_{ij}-{\bar{A}}_{i}-{\bar{A}}_{j}+\bar{A}$$
(2)

The notations \({\bar{A}}_{i}\), \({\bar{A}}_{j}\), and \(\bar{A}\) stand for the mean of row \(i\), the mean of column \(j\), and the overall mean of \(A\), respectively. It can be shown that this transformation preserves the distance relationship between genera \(i\) and \(j\). In the last step, an eigenvalue equation is solved for \(E\), and once the eigenvectors are scaled appropriately, they can be plotted against one another to create a two-dimensional graph.
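The two transformations and the eigenvector step can be checked on a toy example. The sketch below is a minimal pure-Python illustration, not the study's implementation: it uses three hypothetical samples lying on a line at positions 0, 1, and 3, and recovers only the leading axis by power iteration (a full PCoA would use a complete eigendecomposition).

```python
def double_center(D):
    """Apply Eq. (1), then Eq. (2), to a symmetric distance matrix D."""
    n = len(D)
    A = [[-0.5 * D[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(A[i]) / n for i in range(n)]
    col_mean = [sum(A[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [[A[i][j] - row_mean[i] - col_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def leading_axis(E, iterations=200):
    """First principal coordinate via power iteration.

    Assumes E has a dominant positive eigenvalue and that the starting
    vector is not orthogonal to its eigenvector.
    """
    n = len(E)
    v = [float(i + 1) for i in range(n)]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(E[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(x * x for x in w) ** 0.5
        v = [x / eigenvalue for x in w]
    # Scale the unit eigenvector by the square root of its eigenvalue.
    return [x * eigenvalue ** 0.5 for x in v]


# Pairwise distances between hypothetical samples at positions 0, 1, and 3.
D = [[0.0, 1.0, 3.0],
     [1.0, 0.0, 2.0],
     [3.0, 2.0, 0.0]]

E = double_center(D)
coords = leading_axis(E)
print(coords)
```

The recovered coordinates reproduce the original pairwise distances (up to a possible sign flip of the axis), which illustrates the distance-preserving property mentioned above.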

Produced fluids are typically sourced from production wells and reinjected into injection wells using pumps. The average daily injection volume per well is 121 m³/d. Fluid samples from injection wells are collected from the well valve at the mixing station. The experiment uses the produced fluids from injection well X5-4 for DNA sequencing of three distinct samples. The first sample comprises the original produced fluids, amounting to 640 ml. The second sample is obtained by passing the produced fluids through a glass tube filled with silica sand (a type of sand composed primarily of silicon dioxide (SiO₂) in the form of quartz crystals) and allowing them to settle by gravity over a 2-day period. The glass tube is 1.7 m long and 20 cm in diameter and is designed to simulate the permeability (approximately 2000 mD) of a high-permeability reservoir, similar to the one from which the sample is sourced. After 2 days, 100 ml of the produced fluid is collected via a valve attached to the tube. The remaining fluids are then reintroduced into the tube to settle by gravity for an additional 2 days, yielding the third sample of produced fluids. This process mimics the microbial migration that takes place during the circulation of water injection in oil reservoirs. The first sample represents the fluids as they are initially produced from the injection well, while the second sample represents the fluids that have migrated into the production well. Considering the reservoir’s drilling speed of 200 m per day, the 2-day interval is adequate for microbial migration within the 1.7-m glass tube. The third sample represents the subsequent stage of circulating water injection in oil reservoirs. The connectivity rate among the samples is determined by analyzing the shared Operational Taxonomic Units (OTUs) from the original sample’s OTU set, employing a Venn diagram for visualization.
Moreover, the migration and transformation of selected bacterial tracers are examined to assess the feasibility of this technology. The experimental figures and detailed descriptions of samples are presented in Fig. 4 and Supplementary Table 4.
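The shared-OTU comparison behind the Venn diagram reduces to set arithmetic. In the sketch below, the connectivity rate is assumed to be the fraction of the original sample's OTUs that reappear in a later sample; the exact formula is not spelled out in the text, and the OTU labels are hypothetical.

```python
def connectivity_rate(original_otus, later_otus):
    """Fraction of the original sample's OTUs that also appear in a later sample."""
    original = set(original_otus)
    if not original:
        return 0.0
    return len(original & set(later_otus)) / len(original)


# Hypothetical OTU sets for the three experimental samples.
sample_1 = {"OTU1", "OTU2", "OTU3", "OTU4", "OTU5"}  # original produced fluids
sample_2 = {"OTU1", "OTU2", "OTU3", "OTU6"}          # after the first 2-day pass
sample_3 = {"OTU1", "OTU2", "OTU7"}                  # after the second 2-day pass

rate_1_to_2 = connectivity_rate(sample_1, sample_2)
rate_1_to_3 = connectivity_rate(sample_1, sample_3)
print(rate_1_to_2, rate_1_to_3)
```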

Fig. 4: Physical simulation experimental figures.

Typically, the communication between production and injection wells is strongly influenced by the distance between wells. Geological features of the site, particularly discontinuities such as faults, also play a role in determining well-to-well communication. The shorter the well distance, the higher the chance of stronger fluid exchange and thus better connectivity between the wells. In addition, fluid exchange is controlled by reservoir permeability, which is affected by factors such as geological structures, rock properties, and fracture systems. Validating fluids flow pathways through microbial DNA sequencing provides an effective confirmation of well connectivity. If DNA sequencing results show that production wells far from the injection well exhibit strong communication, this indicates good inter-well connectivity with high permeability and supports the use of the microbial DNA technique, which draws on rich microbial diversity.

In this study, the fluids flow pathways are characterized using the communication coefficient, which can be calculated with the Kalman Filtering (KF) method. This coefficient essentially measures how effectively fluid injected into an injection well reaches a production well, reflecting the level of hydraulic connectivity between the wells. The KF method is a recursive data processing method designed to minimize the error variance by continuously updating its estimate as new input signals arrive, without the need to store all data simultaneously57. To calculate the communication coefficient, which signifies the level of influence the injection well has on the production well, the cumulative monthly volumetric liquid production rates (sm³/month) from January 2022 to July 2023 are used as input data. Comparing the final communication coefficients obtained through KF with those derived from microbial DNA sequencing validates the accuracy of the microbial DNA technique. Since this paper is concerned mainly with fluids flow pathways monitoring based on microbial DNA tracing, the KF, as a mathematical method for verifying technical accuracy, is not described in full detail here. Interested readers are referred to the literature for more details58,59,60.
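To illustrate the recursive idea only, the sketch below estimates a single coefficient c with a scalar Kalman filter. The observation model (production ≈ c × injection for one well pair), the initial values, and the synthetic monthly volumes are all assumptions made for this illustration; the formulation actually used in the study is given in the cited references, not here.

```python
def estimate_communication_coefficient(injection, production,
                                       process_var=0.0, measurement_var=1.0):
    """Recursively estimate c in production[k] ~ c * injection[k].

    Scalar Kalman filter with a (nearly) constant state c. Only the current
    estimate and its variance are kept between steps.
    """
    c, p = 0.0, 1.0  # assumed initial estimate and estimate variance
    for u, y in zip(injection, production):
        p += process_var                          # predict: c drifts slowly, if at all
        gain = p * u / (u * u * p + measurement_var)
        c += gain * (y - u * c)                   # correct with the new observation
        p *= 1.0 - gain * u                       # shrink the estimate variance
    return c


# Synthetic, noise-free monthly volumes with a true coefficient of 0.6.
injection = [100.0, 110.0, 95.0, 120.0, 105.0]
production = [0.6 * u for u in injection]

c_hat = estimate_communication_coefficient(injection, production)
print(round(c_hat, 4))
```

Each new month updates the estimate in place, which is the property that lets the filter run without storing the full production history.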

Results

Microbial community composition of injection well fluids over time

Figure 5a depicts the DNA sequencing results for injection well fluids at four different times, revealing 17 specific bacterial genera and "other". Most fluid samples show stable microbial communities, with Methanomethylovorans, Desulfacinum, Candidatus Cloacamonas, Methanothrix, and Geobacter being prominent, as shown in Fig. 5b. In the figure, the x-axis represents the well name, while T1, T2, T3, and T4 correspond to the 1st, 2nd, 3rd, and 4th sampling times, respectively. The y-axis represents the abundance of each bacterial genus. These microbes exhibit consistent abundance levels over time, indicating shared characteristics among injection well fluids. This suggests that representative samples from a few injection wells at the mixing station may reflect the entire set of injection wells. Our measurements provide insights into the overall characteristics and spatial-temporal changes in the microbial community composition of injection well fluids.

Fig. 5: Microbial community composition of produced fluids by injection and production wells over time.

a Composition of microbial communities in the injection well fluids and b abundance of the five most prevalent bacterial genera in the injection well fluids over four sampling times. Community composition in injection and production wells over time: c results of the 1st sampling time, d results of the 2nd sampling time, e results of the 3rd sampling time, and f results of the 4th sampling time. The x-axis shows well names (with blue box indicating injection wells and the others representing production wells) and the y-axis shows genus-level bacterium abundance. g UPGMA cluster tree of fluid samples from injection and production wells over time.

Variations in microbial communities are observed among fluid samples, indicating that microbial communities in injection wells may vary over time due to various influencing factors, leading to differences in microbial composition. Moreover, Fig. 5a focuses on the variations in abundance of the 8 most abundant genera in the injection well fluid samples, emphasizing the differences in abundance levels at each sampling time. It reveals that although most samples display consistent microbial communities, there are notable exceptions. For example, while Methanomethylovorans is usually predominant in all samples, Klebsiella is found to be predominant in the 5N5 sample from the second sampling. Klebsiella is known for its roles in nitrogen fixation in plant ecosystems and interactions in various environments61. In addition, it has been identified as part of some insect microbiota and can have multiple associations with other bacteria in polluted environments such as soil, rivers, and wastewater62. The predominant presence of Klebsiella in 5N5 may be due to environmental contamination. In the third and fourth samplings from Well X5_6, Methanococcus stands out in abundance. Methanococcus is a methane-producing microbe that thrives in high-temperature conditions63. We hypothesize that the predominant presence of Methanococcus is likely due to an increase in water injection and oil pressure in summer. In the third and fourth samplings, the daily injection water volume increased from 100 to 130 m³ and the oil pressure rose from 11.77 MPa to 12.39 MPa. Because these changes take place during the summer, when temperatures are higher, they are expected to promote the growth and proliferation of methane-producing bacteria, leading to an increase in their abundance. Additionally, in July, the oil pressure in Well X5_6 increases by about 2 MPa, reaching 11.39 MPa.
Under high-pressure conditions, methane-producing bacteria are stimulated to grow, resulting in an increase in their abundance as well. The detailed variations in dominant microbial genera across sampling times and environmental influences are shown in Supplementary Table 5.

These abundant bacterial tracers are closely related to CCUS, hydrogen geological storage, and geothermal fields. For example, genera such as Methanococcus, Methanobacterium, Methanomethylovorans, and Methanothrix play key roles in the production of methane, a potent greenhouse gas that is critical to consider in CCUS64. Methanococcus and Methanobacterium are hydrogenotrophic methanogens that generate methane through the reduction of CO₂65,66. Methanomethylovorans is a methylotrophic methanogen capable of producing methane using methanol and other organic compounds as substrates67. Methanothrix is an acetotrophic methanogen that produces methane from acetate68. In geothermal fields, sulfur-reducing microbes such as Desulfacinum are involved in organic matter decomposition, relevant to sulfur reduction fermentation processes69. Additionally, bacteria such as Acinetobacter possess versatile metabolic capabilities, including pathways for the degradation of various long-chain dicarboxylic acids and aromatic compounds, which supports their role in subsurface biodegradation and environmental stability70. Analyzing the abundance of these bacteria enhances the understanding of these processes, which can inform reservoir monitoring and management strategies.

Community composition in injection and production wells and bacterial tracer selection

In this section, we present a differential analysis of microbial community composition in both injection and production wells over time, comparing bacteria and their abundance for each sample. Figure 5c–f illustrates that Methanomethylovorans, Methanococcus, and Desulfacinum are more abundant in injection wells compared to production wells. Conversely, bacterial species like Pseudomonas, Burkholderia, Salmonella, and Akkermansia show lower abundance in injection wells than in production wells.

The UPGMA clustering method is employed to assess the genetic similarities and disparities among the genera present in both the injection and production wells. As shown in Fig. 5g, the left side displays the UPGMA clustering tree, while the right side shows the genera abundance bar graph. In the clustering tree, samples on the same branch with shorter branch lengths exhibit higher similarity in species composition. Figure 5g indicates that the samples from injection wells cluster together in a distinct branch separate from the production well samples, as shown in a blue box. This suggests a clear differentiation in genera between injection and production wells.
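The clustering step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abundance values and sample names are hypothetical, and the Bray-Curtis dissimilarity is an assumption, since the distance measure behind the UPGMA tree is not stated here. UPGMA itself corresponds to average-linkage agglomerative clustering.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Hypothetical relative abundances (rows: samples, columns: genera).
genera = ["Methanomethylovorans", "Methanococcus", "Pseudomonas", "Burkholderia"]
samples = ["inj_1", "inj_2", "inj_3", "pro_1", "pro_2", "pro_3"]
abundance = np.array([
    [0.50, 0.30, 0.10, 0.10],
    [0.45, 0.35, 0.12, 0.08],
    [0.55, 0.25, 0.08, 0.12],
    [0.10, 0.05, 0.50, 0.35],
    [0.12, 0.08, 0.45, 0.35],
    [0.08, 0.07, 0.55, 0.30],
])

# Pairwise Bray-Curtis dissimilarity between samples (assumed metric).
distances = pdist(abundance, metric="braycurtis")

# UPGMA = average-linkage hierarchical clustering on the distance vector.
tree = linkage(distances, method="average")

# Cut the tree into two branches and record the branch of each sample.
branches = fcluster(tree, t=2, criterion="maxclust")
branch_of = dict(zip(samples, branches.tolist()))
print(branch_of)
```

With real data, `scipy.cluster.hierarchy.dendrogram(tree, labels=samples)` would draw the tree so that branch lengths can be compared as described for Fig. 5g.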

The observed differences in microbial community composition between injection and production wells suggest that long-term water injection has fostered a stable and distinct microbial ecosystem within the reservoir. These findings underscore the key role of microbial diversity in shaping these variations. To be effective tracers, bacteria must remain stable throughout the reinjection cycle and exhibit tracer characteristics both aboveground and underground. Representative endogenous bacterial tracers should meet two essential criteria: (1) they must be consistently present in both injection and production wells, and (2) their abundance must be higher in injection wells than in production wells. Based on these criteria, nine bacterial species, including Methanococcus and Candidatus_Cloacamonas, have been selected as tracers. Their higher abundance in injection wells, as shown in Fig. 6, forms the basis for the subsequent inter-well communication analysis, which indicates the pathways of deep subsurface fluid flow. Additionally, the distribution of these bacterial tracers in injection and production wells across the four sampling times is shown in panels (a) to (d) of Fig. 6.
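The two selection criteria can be expressed as a simple filter. The sketch below is illustrative only: the abundance values are hypothetical, "consistently present" is interpreted as non-zero abundance in every sample, and "higher abundance" is interpreted as a higher mean, both of which are assumptions rather than the authors' stated thresholds.

```python
def select_tracers(injection, production):
    """Return genera that satisfy both tracer criteria.

    injection / production map genus name -> list of relative abundances,
    one value per sample (hypothetical layout).
    """
    tracers = []
    for genus, inj_values in injection.items():
        pro_values = production.get(genus, [])
        if not inj_values or not pro_values:
            continue
        # Criterion 1: present in every injection and production sample.
        present = all(v > 0 for v in inj_values) and all(v > 0 for v in pro_values)
        # Criterion 2: mean abundance higher in injection than in production wells.
        inj_mean = sum(inj_values) / len(inj_values)
        pro_mean = sum(pro_values) / len(pro_values)
        if present and inj_mean > pro_mean:
            tracers.append(genus)
    return sorted(tracers)


# Hypothetical abundances for illustration.
injection = {
    "Methanococcus": [0.12, 0.10],
    "Desulfacinum": [0.08, 0.09],
    "Pseudomonas": [0.01, 0.02],
    "Klebsiella": [0.30, 0.00],
}
production = {
    "Methanococcus": [0.03, 0.02],
    "Desulfacinum": [0.01, 0.02],
    "Pseudomonas": [0.20, 0.15],
    "Klebsiella": [0.00, 0.00],
}

selected = select_tracers(injection, production)
print(selected)
```

In this toy input, Pseudomonas fails the second criterion and Klebsiella fails the first, so only the two genera that are present everywhere and enriched in the injection wells remain.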

Fig. 6: Abundance and pie chart of selected bacterial tracers at four sampling times.

Abundance of the nine selected bacterial tracers at four sampling times: a 1st, b 2nd, c 3rd, and d 4th. The pie chart of the abundance ratio of the genera in each sample is located in the upper right corner of each graph.

Fluids flow pathways between injection and production wells

Principal coordinate analysis (PCoA) results and interpretation

To characterize the arrival time of injected fluids at the production wells, we construct a breakthrough curve using the concentration data of nine bacterial tracers collected on June 14, July 14, and July 27, as shown in Fig. 7a for production Well 8P406. The x-axis represents the sampling time, and the y-axis denotes the concentration of the tracer strain in the production fluids. For most tracers sampled in Well 8P406, the concentration initially increases and then stabilizes or decreases at the subsequent sampling time. This pattern indicates that after 30 days of injection, the injected fluid has reached production Well 8P406, even though Acinetobacter shows a continuous decrease in concentration. This inconsistent behavior of Acinetobacter is likely due to its metabolic processes, which cause its degradation. After 44 days of injection, the concentrations of all tracers have stabilized. The breakthrough curves of the bacterial tracers’ concentrations in the injection well are shown in Fig. 7b. At the initial stage, the relative abundance of tracers in the injection well is the highest. By day 30, as microorganisms migrate into the formation and along the flow path toward the production well, their relative abundances decrease. By day 44, the relative abundance increases slightly, possibly due to reinjection effects. The front propagation velocity can be defined as the speed at which the injected fluid travels from the injection well to the production well. In this case, the injected fluid reaches the production well after 30 days, and the concentrations of all tracers stabilize after 44 days. Therefore, the front propagation time in this study is taken as 44 days. To assess the injection-production communication, we conduct a PCoA using samples from the first sampling of injection well fluids and the fourth sampling of production well fluids. This approach helps in understanding the dynamics and interactions between injection and production wells based on the stabilized tracer concentrations.

Fig. 7: Breakthrough curves of bacterial tracers’ concentrations.

a Breakthrough curves of bacterial tracers’ concentrations in the production well. b Breakthrough curves of bacterial tracers’ concentrations in the injection well.

Figure 8a shows the results of the PCoA, which uses a distance matrix constructed from DNA sequencing data and takes into account the front propagation velocity between samples collected at the injection well initially and at the production well 44 days later. In this figure, the high-dimensional DNA data are reduced based on the bacterial species and their abundance in the well fluids. Here the PCoA uses squared Euclidean distance for measuring similarity, Percent Scaler for standardization, and row-wise standard direction for data processing. The PCoA results are categorized into five groups with a 0.9 confidence level. In the visualization, the blue area represents the injection well samples, while the other colors denote the production well samples. In principle, the closer a point lies to the injection wells on the PCoA plot, the stronger the communication. We therefore conclude that the production wells in the pink area, classified as Class 2, show stronger communication with the injection well, suggesting higher microbial similarity and indicating effective tracer movement and interaction between these wells.
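For readers unfamiliar with the technique, the ordination can be sketched with NumPy alone. This is a minimal classical PCoA under stated assumptions, not the software used in the study: the OTU counts and sample names are hypothetical, the "Percent Scaler" step is assumed to mean row-wise conversion to percentages, and the distances are Euclidean and squared inside the Gower centering, which is how classical PCoA handles them.

```python
import numpy as np

def pcoa(dist, n_axes=2):
    """Classical PCoA of a symmetric distance matrix.

    Returns sample coordinates and the share of positive eigenvalue
    mass carried by each retained axis.
    """
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Gower double-centering of the squared distances.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = np.clip(eigvals, 0.0, None)     # drop tiny negative round-off
    coords = eigvecs[:, :n_axes] * np.sqrt(positive[:n_axes])
    explained = positive[:n_axes] / positive.sum()
    return coords, explained

# Hypothetical OTU counts (rows: samples, columns: genera).
names = ["inj", "pro_A", "pro_B", "pro_C", "pro_D"]
counts = np.array([
    [520.0, 310.0, 90.0, 80.0],
    [480.0, 150.0, 200.0, 170.0],
    [120.0, 60.0, 450.0, 370.0],
    [300.0, 200.0, 260.0, 240.0],
    [90.0, 40.0, 500.0, 370.0],
])

# Assumed standardization: each row rescaled to percentages.
percent = 100.0 * counts / counts.sum(axis=1, keepdims=True)

# Pairwise Euclidean distances between samples.
diff = percent[:, None, :] - percent[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

coords, explained = pcoa(dist)

# Distance of each production sample from the injection sample in the plot plane.
gap = np.linalg.norm(coords[1:] - coords[0], axis=1)
for name, value in sorted(zip(names[1:], gap), key=lambda item: item[1]):
    print(f"{name}: {value:.2f}")
```

Sorting the production samples by their distance from the injection sample in ordination space mirrors the reading of Fig. 8a, where proximity to the injection wells is taken as stronger communication.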

Fig. 8: PCoA and Venn results of injection and production wells.

a Principal coordinate analysis results. b–d Venn diagrams of bacterial tracer abundance for injection Wells b 7N3, c 5–4, and d 5N5. The coordinate axis below each Venn diagram represents the total number of OTUs for each well fluid sample.

The PCoA results in Fig. 8a indicate that production Wells 6X403, 8–3, 8P406, 6–0, 6N7, 8–06, 6–406, and 8–0 exhibit strong communication with the injection well, while production Wells 7C7, 6–8, and 6–405 show weak communication with the injection well. The analysis also highlights strong pair-wise communication between certain injection and production wells, such as inj-7N3 with pro-6X403, inj-5–4 with pro-8–3, and inj-5N5 with pro-8P406. Here inj denotes the injection well and pro denotes the production well. To further illustrate the overlap of bacterial species between the injection and production wells with varying communication strengths, we plot Venn diagrams using the jvenn library71, which facilitates the visualization of shared and unique elements in complex datasets. These diagrams show the shared and unique bacterial species among the wells, providing a visual representation of microbial community similarities and differences. Together, the PCoA and Venn diagrams offer a comprehensive understanding of microbial dynamics and inter-well communication, aiding in the assessment of fluid flow and reservoir management strategies.

In Fig. 8b–d, wells are categorized based on communication levels. The Venn diagrams display the shared and unique OTUs between samples, highlighting the degree of similarity in microbial communities between specific injection and production wells. Each color block in the figure represents a sample, and the overlapping areas between color blocks indicate the OTUs shared among the corresponding wells. The number in each block is the number of OTUs contained in that block. The coordinate axis below each Venn diagram represents the number of OTUs contained in each well. The analysis reveals strong communication for the production wells that lie closer to the injection wells in the figure, as evidenced by high percentages of shared OTUs. The percentage of OTUs shared between a production well and the injection well, relative to the total number of OTUs in the injection well (Fig. 8b–d), is defined as the connectivity rate, as shown in Eq. (3). More specifically, for injection Well 7N3, production Wells 6X403, 6–406, and 8–0 account for 35.3%, 31.1%, and 26.1% of shared OTUs, respectively, as shown in Fig. 8b. The study identifies the deep subsurface fluid flow pathways as follows: from Well 7N3 to Well 6X403, from Well 5–4 to Well 8–3, and from Well 5N5 to Well 8P406.

$$\text{Connectivity rate}=\frac{\text{OTUs shared between the production and injection wells}}{\text{Total OTUs in the injection well}}\times 100\%$$
(3)
$$\text{Production Well 6X403}:\ \frac{148+63+78+213}{1423}=35.3\%$$
(4)
$$\text{Production Well 6-406}:\ \frac{113+78+213+39}{1423}=31.1\%$$
(5)
$$\text{Production Well 8-0}:\ \frac{63+213+39+57}{1423}=26.1\%$$
(6)

For the injection Well 5–4, the production Wells 8–3, 8–0, and 6N7 account for 36.2%, 27.3%, and 18.2% of shared OTUs as shown in Fig. 8c.

$$\text{Production Well 8-3}:\ \frac{167+46+222+53}{1349}=36.2\%$$
(7)
$$\text{Production Well 8-0}:\ \frac{76+222+53+17}{1349}=27.3\%$$
(8)
$$\text{Production Well 6N7}:\ \frac{46+53+17+130}{1349}=18.2\%$$
(9)

For the injection Well 5N5, the production Wells 8P406, 6–0, and 8–0 account for 36.2%, 31.9%, and 30.6% of shared OTUs, as shown in Fig. 8d.

$$\text{Production Well 8P406}:\ \frac{155+79+70+257}{1551}=36.2\%$$
(10)
$$\text{Production Well 6-0}:\ \frac{88+70+257+80}{1551}=31.9\%$$
(11)
$$\text{Production Well 8-0}:\ \frac{257+79+80+59}{1551}=30.6\%$$
(12)
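The arithmetic in Eqs. (3) to (12) can be reproduced with a few lines of Python. The Venn region counts and injection-well OTU totals below are copied from the equations above; the dictionary layout and function name are illustrative choices, not part of the original workflow.

```python
def connectivity_rate(shared_regions, total_injection_otus):
    """Eq. (3): shared OTUs as a percentage of the injection well's OTUs."""
    return 100.0 * sum(shared_regions) / total_injection_otus

# injection well -> (total OTUs, {production well: Venn regions shared with it})
venn_counts = {
    "7N3": (1423, {"6X403": [148, 63, 78, 213],
                   "6-406": [113, 78, 213, 39],
                   "8-0": [63, 213, 39, 57]}),
    "5-4": (1349, {"8-3": [167, 46, 222, 53],
                   "8-0": [76, 222, 53, 17],
                   "6N7": [46, 53, 17, 130]}),
    "5N5": (1551, {"8P406": [155, 79, 70, 257],
                   "6-0": [88, 70, 257, 80],
                   "8-0": [257, 79, 80, 59]}),
}

rates = {
    inj: {pro: connectivity_rate(regions, total) for pro, regions in wells.items()}
    for inj, (total, wells) in venn_counts.items()
}

# The production well with the highest rate marks the inferred flow pathway.
pathways = {inj: max(by_well, key=by_well.get) for inj, by_well in rates.items()}

for inj, by_well in rates.items():
    listing = ", ".join(f"{pro} {rate:.1f}%" for pro, rate in by_well.items())
    print(f"inj-{inj}: {listing} -> strongest: pro-{pathways[inj]}")
```

Taking the maximum rate per injection well yields the pairs 7N3 to 6X403, 5–4 to 8–3, and 5N5 to 8P406.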

Well distance and communication coefficient

The distance between each production well and its nearest injection well is illustrated in Supplementary Fig. 1. The production wells with the shortest distances are 7C7, 8–3, 8P406, 6–0, 6–8, 8–0, 6XN3, 7N7, 6–406, 8–6, 8–01, and 6–6. By comparing microbial DNA results with these distances, we can explore the relationship between microbial community composition and well proximity. Microbial DNA sequencing results indicate strong communication for production Wells 8–3, 8P406, and 6–0, which are closer to the injection well. These production wells also demonstrate higher permeability (see Supplementary Fig. 2) and better production performance. Conversely, production Wells 7C7, 6–8, and 6–405, despite their proximity to the injection well, show low similarity in microbial communities and low permeability. Notably, production Wells 6X403, 6N7, and 6XN3, located farther from the injection well, exhibit high similarity in microbial communities as well as high permeability. These findings confirm the accuracy and feasibility of using microbial DNA analysis to assess deep subsurface fluid flow pathways over the life cycle of circulating water injection.

The low communication between Wells 7C7, 6–8, 6N7, and 6X403 and the injection wells is likely due to their poor physical properties. Well 6XN3 exhibits poor communication with the injection well, likely because the adjacent reservoir has low fluid transfer capability. In contrast, Well 8P406 demonstrates the highest communication based on microbial DNA tracing. This difference in behavior between Wells 8P406 and 6XN3 could be caused by changes in the fluid interface from July to August, as shown in Fig. 9a, b. The fluid interface is the depth from the surface to the produced fluid level in the wellbore, which is a direct manifestation of the well’s fluid supply capacity. Effective inter-well communication allows injected fluids to transfer pressure efficiently, potentially elevating the fluid interface level in the production well. When fluid is injected into the injection well, an upward trend in the produced fluid level in the production well indicates strong fluid transfer ability; conversely, a decreasing fluid level suggests poor communication. The poor communication of Well 6–405 may be attributed to the influence of a fault, which causes discontinuities in the rock layers and hinders fluid flow between them, thereby impairing injection-production communication.

Fig. 9: Validation results of physical simulation experiments and Kalman filtering method.

Fluid interface level in Wells a 8P406 and b 6XN3. c–d The DNA sequencing results of the Physical Simulation experiments. c The relative abundance of the A1, A2, and A3. d Venn diagram of OTUs for A1, A2, and A3. Inter-well communication coefficients based on the Kalman filtering method: (e) Well 7N3, (f) Well 5–4, and (g) Well 5N5.

The effectiveness of microbial DNA sequencing for evaluating communication between injection and production wells is confirmed when production wells near the injection well exhibit strong communication and high permeability, while distant wells show weak communication and low permeability. Notably, even when very distant wells display strong communication and high permeability, it further validates the sequencing results. This consistency with actual permeability measurements underscores the accuracy and feasibility of using microbial DNA sequencing.

Physical simulation verification—microbial migration and transformation experiment

The microbial compositions from the physical simulation experiment are depicted in Fig. 9c. The presence of microbial tracers such as Methanococcus, Candidatus Cloacamonas, Methanobacterium, Desulfacinum, Methanomethylovorans, Methanothrix, and Acinetobacter in the simulation results indicates their stable existence in both above-ground and subsurface environments during water injection. This stability supports the validity of the selected microbial tracers. Furthermore, the OTU Venn diagram results are presented in Fig. 9d. The numbers 22 and 9 in the areas shared between A1 and A2 signify that there are 31 shared OTUs between the injection well and the production well to which the injected fluid migrates. The number 22 in the area shared among A1, A2, and A3 indicates that there are 22 shared OTUs among the injection well, the production well receiving the injected fluid, and the well into which the circulating water is injected. The connectivity rate between A1 and A2 is 33.33%, and the connectivity rate among A1, A2, and A3 is 23.66%, both of which are close to the connectivity rates calculated above for the field site. These rates further support the accuracy and rationality of the method. The connectivity rate is calculated as follows:

$$\text{A1 and A2}:\ \frac{22+9}{93}=33.33\%$$
(13)
$$\text{A1, A2 and A3}:\ \frac{22}{93}=23.66\%$$
(14)

Mathematical method verification—Kalman filter method

Using the liquid production rate (in sm3/month) from January 2022 to July 2023 as input and employing Kalman filtering, we obtain the inter-well communication coefficients shown in Fig. 9e–g. The analysis indicates strong communication between injection Well 7N3 and production Well 6X403, between injection Well 5–4 and production Well 8–3, and between injection Well 5N5 and production Well 8P406. These findings align with the assessments based on microbial DNA sequencing, further validating its accuracy.
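The state-space model behind the Kalman filtering step is not given here, so the following is only a generic sketch of how such coefficients might be estimated. It assumes a linear model in which the monthly production rate of one well is a weighted sum of the rates of three injection wells, treats the weights as a slowly drifting state, and uses synthetic data; the function name, noise settings, and rates are all hypothetical.

```python
import numpy as np

def kalman_coefficients(injection, production, process_var=1e-6, obs_var=1.0):
    """Estimate weights w in production[t] ~ injection[t] @ w.

    State: w (random walk). Observation matrix at step t: injection[t].
    """
    n_steps, n_wells = injection.shape
    w = np.zeros(n_wells)            # initial coefficient estimate
    cov = np.eye(n_wells)            # initial state covariance
    for t in range(n_steps):
        h = injection[t]
        cov = cov + process_var * np.eye(n_wells)      # predict
        innovation = production[t] - h @ w
        innovation_var = h @ cov @ h + obs_var
        gain = cov @ h / innovation_var                # Kalman gain
        w = w + gain * innovation                      # update state
        cov = cov - np.outer(gain, h) @ cov            # update covariance
    return w

# Synthetic example: 19 monthly records (January 2022 to July 2023).
rng = np.random.default_rng(0)
true_w = np.array([0.5, 0.3, 0.1])                     # hypothetical coefficients
injection = rng.uniform(80.0, 130.0, size=(19, 3))     # hypothetical rates
production = injection @ true_w + rng.normal(0.0, 1.0, size=19)

w_est = kalman_coefficients(injection, production)
print(np.round(w_est, 2))
```

In this formulation the injection well with the largest estimated coefficient would be read as the one communicating most strongly with the production well, which is the kind of comparison made against the DNA-based pathways.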

Discussion

The system boundary and inventory

When using Life Cycle Assessment to evaluate water circulation injection, it is essential to define the system boundary and create an inventory. The boundaries include the geographic boundary, temporal boundary, lifecycle stages, and well group. The detailed boundaries are as follows:

(1) Geographical boundary: The geological structure within the block is relatively simple and gentle, with no complex faults or folding structures, ensuring the stability of fluid flow and microbial migration pathways.

(2) Temporal boundary: No well pattern adjustments or changes in injection or production wells were made during the sampling period, ensuring consistent sampling conditions for the analysis of microbial community changes.

(3) Lifecycle stages: The lifecycle stages include the initial injection and production well sampling stage, in which both injection and production wells are sampled to establish a baseline for the initial microbial community and to monitor changes in it, and the post-reinjection sampling stage, which is conducted after reinjection to assess the impact of reinjection on the microbial community.

The inventory records the initial microbial community, with species and abundance determined by DNA sequencing, and the changes in the microbial community over time, tracked through regular sampling and analysis.

Key advantages of using DNA sequencing for deep subsurface fluids flow pathways monitoring

Because it integrates DNA sequencing with data analysis technology, this monitoring and assessment method offers the dual benefits of low cost and high resolution. Moreover, it is capable of tracking the life cycle of geoenergy applications. Sampling the produced fluids is all that is needed to obtain the monitoring and assessment results. More specifically, this study elaborates the life cycle microbial monitoring and assessment of fluid pathways resulting from circulating water injection in reservoirs. In summary, the key advantages of this method include:

(1) Simplicity: It eliminates the need to inject chemical tracers or trace elements, thereby avoiding contamination of the formation’s background concentrations. This straightforward approach relies solely on analyzing microbial DNA present in the produced fluids.

(2) Cost-effectiveness: It reduces the need for expensive equipment or reagents, making it an affordable option for large-scale applications and research projects.

(3) Real-time dynamic monitoring: This method allows for continuous assessment by sampling and analyzing microbial DNA at various times. Unlike traditional one-time tracer methods, microbial DNA sequencing enables ongoing monitoring, providing real-time insights into the dynamic fluid flow pathways.

(4) Environmental safety: By avoiding the injection of foreign tracers, this method mitigates the risk of introducing contaminants into the reservoir, preserving the natural microbial ecosystem.

Potential of using DNA sequencing in Geothermal application, CCUS, and Hydrogen geological storage

At the same time, this monitoring method is not limited by the type of reservoir or by varying geological conditions in different regions. According to previous research, microbial communities remain widespread in the deep subsurface, even at temperatures exceeding 130 °C72. This shows the high versatility of the monitoring method, even in extreme environments. It suggests that gas or fluid flow pathways can be tracked in applications such as CCUS and hydrogen storage and generation, addressing the issue of gas leakage. It also indicates the feasibility of monitoring fluids in geothermal energy applications, which is a critical link in developing an accurate underground water flow model. Specifically, it offers the following advantages in the respective applications:

(1) Assessing microbial communities: DNA sequencing provides comprehensive insights into the functions and metabolic capabilities of microbial communities. By analyzing the genetic material of microbes present in reservoirs, engineers can monitor and optimize CO2 storage efficiency and evaluate the effects of CO2 storage and geothermal energy development on underground ecosystems.

(2) Enhancing biohydrogen production: The technology helps identify key microorganisms involved in biohydrogen production and storage. Understanding the roles and interactions of these microbes can lead to improved efficiency in biohydrogen processes, making them more viable and productive.

(3) Monitoring and optimization: In the context of geothermal energy and hydrogen storage and generation, DNA sequencing allows for continuous monitoring of microbial activity and responses to various interventions. This real-time data is crucial for optimizing operational strategies and ensuring the sustainability of these subsurface applications.

Limitations of DNA sequencing technology for deep subsurface fluids flow monitoring

While DNA sequencing technology offers advantages for monitoring deep subsurface fluids flow pathways, it also has some limitations that need to be addressed to ensure accurate and reliable results.

(1) Technological demands: DNA sequencing technology requires sophisticated equipment and advanced technical expertise. Ensuring the accuracy of sequencing data necessitates the use of cutting-edge technology and well-trained personnel, which may not always be readily available, especially in remote field locations.

(2) Potential for external contamination: Sources of contamination include: (a) Microbial contamination: External microbes can contaminate samples during collection, storage, or analysis, leading to false positives in sequencing results. (b) Tool and equipment contamination: Tools and equipment used in the sampling and sequencing processes can introduce contaminants if not properly sterilized and maintained. (c) Rock surface contamination: Microbes present on the surfaces of rocks and other materials in the reservoir environment can interfere with the sequencing data, complicating the interpretation of the true microbial community structure.

(3) Interpretation challenges: The presence of interfering microbes can make it difficult to distinguish between indigenous microbial communities and contaminants. This requires meticulous sample handling and rigorous data analysis protocols to ensure that the results accurately reflect the in-situ microbial populations and their interactions. For example, high sample quality and reliable DNA sequencing are needed to ensure the accuracy of the extracted microbial DNA and the sequencing data. In addition, interfering microbes in the sequencing results may come from external microbial contamination, tool and equipment contamination, or rock surface contamination, any of which may affect the interpretation of the results.

Future work

While microbial tracers currently complement traditional passive tracers and require validation through conventional approaches such as the Kalman filter (KF), their potential as an independent reservoir monitoring tool may grow as their accuracy is further established through long-term studies. In future work, we plan to conduct extensive and frequent sampling of produced fluids from injection and production wells over a longer time frame and on a larger scale. By increasing the sampling frequency, we aim to collect a substantial number of samples for microbial DNA sequencing. Analyzing these sequencing results should allow us to predict the bacterial types and their abundance in the fluids, which would enable real-time tracking and forecasting of fluid flow paths in reservoirs. This approach will facilitate continuous monitoring and prompt adjustments. The insights gained will be crucial for optimizing oilfield development strategies and enhancing production efficiency. Through simulations and mathematical methods, such as the multiphase poromechanical simulator or the Automated Monte Carlo-based quantification method, the impact and uncertainty associated with geoenergy sustainability can be quantified by monitoring subsurface fluid data73,74. Ultimately, a substantial dataset of production fluid samples will provide precise data to optimize oilfield development and boost production efficiency, paving the way for more effective management of hydrogeological systems and for increased productivity and sustainability in reservoir operations.

Conclusions

This study presents large-scale, long-term dynamic monitoring based on microbial DNA sequencing at a study site in China. Sampling and DNA sequencing are conducted in four stages, totaling 59 samples from 11 injection wells and 26 production wells. By identifying bacterial tracers, we obtain the life cycle deep subsurface fluid flow pathways. This method is straightforward, environmentally friendly, and cost-effective, and it allows for long-term dynamic monitoring, offering valuable insights for life cycle monitoring in CCUS, geothermal applications, and hydrogen geological storage. The main conclusions are as follows:

1) Samples from injection wells reveal stable microbial communities that include Methanomethylovorans, Desulfacinum, and Candidatus_Cloacamonas. Differential analysis of microbial compositions over time between injection and production wells indicates notable variations. Prolonged water injection introduces external bacterial populations, leading to a stable microbial community in the reservoir.

2) Nine bacterial tracers, including Methanococcus, Candidatus_Cloacamonas, Methanobacterium, Desulfacinum, Geobacter, Methanomethylovorans, Methanothrix, Fervidobacterium, and Acinetobacter, are identified for this study site. The feasibility of the selected microbial tracers is confirmed by the physical simulation experiment of circulating water injection.

3) The connectivity rates between injection Wells 7N3, 5–4, and 5N5 and production Wells 6X403, 8–3, and 8P406 are 35.3%, 36.2%, and 36.2%, respectively, indicating the flow paths. These findings are consistent with the Kalman filtering verification results, confirming the accuracy of the deep fluid flow pathways.