Deep learning forecasts the spatiotemporal evolution of fluid-induced microearthquakes

Chung, Jaehong; Manga, Michael; Kneafsey, Timothy; Mukerji, Tapan; Hu, Mengsu

doi:10.1038/s43247-025-02644-z

Article
Open access
Published: 07 August 2025

Communications Earth & Environment volume 6, Article number: 643 (2025)

Abstract

Microearthquakes generated by subsurface fluid injection record the evolving stress state and permeability of reservoirs. Forecasting their spatiotemporal evolution is therefore critical for applications such as enhanced geothermal systems, carbon dioxide sequestration and other geoengineering applications. Here we propose a transformer neural network model that ingests hydraulic stimulation history and prior microearthquake observations to forecast four key quantities: cumulative microearthquake count, cumulative logarithmic seismic moment, and the 50th- and 95th-percentile extents of the microearthquake cloud. Applied to the EGS Collab Experiment 1 dataset, the model achieves R² > 0.98 for the 1-s forecast horizon and R² > 0.88 for the 15-s forecast horizon across all targets, and supplies uncertainty estimates through a learned standard deviation term. These accurate, uncertainty-quantified forecasts enable real-time inference of fracture propagation and permeability evolution, demonstrating the strong potential of deep-learning approaches to improve seismic-risk assessment and guide mitigation strategies in future fluid-injection operations.

Introduction

Subsurface applications for climate mitigation and sustainability are essential to achieving the net-zero emissions target set by the Intergovernmental Panel on Climate Change for 2050¹. Key geo-engineering strategies include the development of enhanced geothermal systems (EGS) for renewable energy generation and the geological storage of carbon dioxide (CO₂) to reduce atmospheric greenhouse gas concentrations. The U.S. Geological Survey (USGS) estimates that EGS could provide over 500 GWe of electricity in the western United States alone². In addition, carbon dioxide sequestration has the potential to store at least 1000 GtCO₂ in saline aquifers, with further storage capacity available in depleted oil and gas reservoirs and coal formations^3,4. Despite the immense potential to reduce greenhouse gases through these subsurface applications, a key challenge is the induced seismicity that can result from fluid injection operations^5,6,7. Fluid injection perturbs in-situ stress fields in the subsurface, potentially leading to the reactivation of preexisting faults or the creation of new fractures, both of which can compromise the integrity of reservoirs⁸. Notable examples include the magnitude 5.7 2011 Prague and magnitude 5.8 2016 Pawnee earthquakes in Oklahoma after wastewater injection^9,10,11,12, and a magnitude 3.9 earthquake following circulation tests for the EGS project in Vendenheim, France¹³. These events underscore the critical need for accurate forecasting of induced seismicity to ensure the safe implementation of subsurface technologies.

Accurately forecasting fluid-induced seismicity remains a challenge due to the complex interactions between geological, hydrological, and mechanical factors^5,14. Traditional approaches rely on physics-based models to estimate induced seismicity by coupling fluid flow, mechanical deformation, and seismicity rates^15,16,17,18. Although these models can capture intricate subsurface interactions, they face limitations in real-world applications. Challenges include uncertainties in fracture geometries, material heterogeneity, and in-situ stress conditions. Moreover, assumptions such as isotropic material properties or idealized fracture networks are often required, reducing predictive accuracy. High computational costs associated with three-dimensional modeling with complex fracture geometries further restrict their use in practical forecasting and operational decision-making^15,17. As a result, discrepancies between modeled and observed seismicity frequently occur.

From a statistical perspective, the Epidemic-Type Aftershock Sequence (ETAS) model provides a forecasting approach for both natural and fluid-induced seismicity, based on the assumption that an earthquake can trigger clusters of aftershocks^19,20. In particular, nonstationary ETAS models have effectively demonstrated their capability in detecting the impacts of fluid-induced seismicity by employing a nonstationary background rate^19,21,22,23. This capability positions ETAS as a valuable tool for generating probabilistic earthquake forecasts. However, determining key parameters, including the timing of peak activity, solely based on statistical analysis has been challenging²⁴. Thus, successful applications of ETAS models to spatiotemporal forecasting of microearthquakes (MEQs) due to fluid injection may be limited.

Data-driven approaches, particularly machine learning, have emerged as powerful complements or alternatives to traditional physics-based and statistical frameworks in a range of geoscientific applications^25,26,27,28,29,30,31,32,33,34,35,36,37. These methods do not require detailed prior knowledge of uncertain subsurface properties but instead leverage large datasets from monitoring systems to identify patterns and correlations that can be used for forecasting. For instance, deep learning, with and without physical constraints, was used to forecast the seismicity rate, which was then used to estimate the maximum magnitude of fluid-induced microseismicity³⁸. A bidirectional long short-term memory (LSTM) neural network predicted fluid-induced permeability evolution based on MEQ features, including seismic rate and cumulative logarithmic seismic moment³⁹. In addition, an LSTM model was employed to predict average permeability changes inferred from the seismicity data. Another LSTM model was used to predict pore pressure and associated fault displacements given the fluid injection cycles⁴⁰. These studies demonstrate that deep learning approaches can effectively capture the temporal evolution of permeability or micro-seismicity based on operational parameters. However, they often focus solely on temporal predictions without considering the spatial evolution of MEQs, which is critical for assessing the extent of affected areas and potential impacts. Furthermore, these models rely on simplified assumptions for permeability changes, such as a triggering front of the MEQ cloud that migrates in proportion to the square root of time since the start of injection, which is inconsistent with observed MEQ data⁴¹. These idealizations limit the applicability and accuracy of the models in complex scenarios.

Our study advances the forecasting of the spatiotemporal evolution of MEQs induced by hydraulic stimulation using a deep learning approach that tackles these challenges. Specifically, we employ transformer networks, a type of neural network architecture that uses self-attention mechanisms to capture complex dependencies within data sequences^42,43. Compared with recurrent neural networks such as LSTMs, transformer networks can model long-range temporal dependencies more efficiently and are less susceptible to issues like vanishing gradients⁴⁴. Their ability to focus on different parts of the input data through attention mechanisms makes them particularly well-suited for capturing both spatial and temporal patterns in MEQ data. Based on hydraulic stimulation history, our model predicts key MEQ features, including the cumulative number of MEQs, cumulative seismic moment, and the spatial extent of induced micro-seismicity. By incorporating both spatial and temporal information, the model provides more comprehensive forecasts that can inform real-time monitoring and risk mitigation strategies in subsurface activities.

Results

We use hydraulic stimulation data and MEQ history from the EGS Collab^45,46. Figure 1 shows the architecture of our transformer model for forecasting the spatiotemporal evolution of MEQs based on hydraulic stimulation and MEQ histories (see section “Method: transformer neural network architecture”).

**Fig. 1: Architecture of the transformer-based MEQ forecasting model.**

EGS Collab hydraulic stimulation datasets

We utilize hydraulic stimulation and MEQ data from the EGS Collab project, a series of intermediate-scale (10–20 m) field tests at the Sanford Underground Research Facility in Lead, South Dakota. This study focuses on data from Experiment 1, which aimed to produce a fracture network connecting an injection well to a production well via hydraulic fracturing⁴⁷. A series of stimulations and flow tests was conducted at a depth of 1.5 km to re-open and generate hydraulic fractures in crystalline rock under reservoir-like stress conditions, monitored with a cataloged passive seismic record⁴⁸ and Continuous Active-Source Seismic Monitoring^45,49,50.

Figure 2 shows the stimulation-induced MEQs for each stimulation event along with the injection and production wells. Two 60 m boreholes were used for injection (E1-I) and production (E1-P), respectively. A total of five stimulation episodes were carried out in May 2018. During the first two stimulations, injections at flow rates less than 1 L/min produced few MEQs. In addition, water leakage was observed between the production well and one monitoring well. The injection point was therefore moved to a notch at a depth of 50 m in the injection hole (red triangle in Fig. 2) starting from Stimulation 3 and used through Stimulation 5. Stimulations 3–5 were three continuous hydraulic stimulations performed using controlled step-rate injections to re-open or create fractures around the injection well, with a maximum injection rate of 5 L/min, which produced rich MEQ signals^46,51. This study therefore uses data from Stimulations 3 to 5, generated from the same injection point with a rich MEQ history, to train neural networks. The data were recorded at 1-s intervals. Stimulations 3 and 4 each lasted approximately 1 h (3600 time steps), and the first 1 h and 10 min of Stimulation 5 were used (4100 time steps). These continuous records were segmented into overlapping input-output windows for supervised training, validation, and testing, as described in section “Data preprocessing: crop and normalization”.

**Fig. 2: Spatial distribution of fluid-induced microearthquakes during hydraulic stimulations.**

Figure 3 presents the series of stimulations along with the spatiotemporal MEQ data and corresponding magnitudes. Detailed information about the MEQs, including location, time, and magnitude, was continuously monitored during the hydraulic stimulations^45,46. In addition, to quantify the spatial extent of MEQs in response to fluid injection, we extracted the 95th- and 50th-percentile (median) distances of the MEQ cloud from the injection point as a function of time. Although the monitoring array is extensive, the catalog still carries intrinsic uncertainties: hypocenter locations are accurate to about 1 m and there is no reported uncertainty range for magnitude⁴⁵. These uncertainties limit the fidelity of the training data and establish a floor on achievable forecast accuracy. Additionally, including all raw events, without excluding those below the magnitude of completeness, could constrain the neural network’s capability to learn underlying MEQ patterns (Supplementary Fig. 1).
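The percentile distances described above can be sketched in a few lines. This is a minimal illustration, not the authors' processing code: the catalog layout `(time_s, x, y, z)`, the function names, the use of all events up to the current time, and the linear-interpolation percentile rule are all assumptions made for the example.

```python
import math

def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics (assumed rule)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac

def cloud_extent(events, injection_point, t_now):
    """Return (P50, P95) Euclidean distance of all events with time <= t_now.

    events: list of (time_s, x, y, z) tuples; a hypothetical catalog layout.
    Returns None while no event has occurred yet.
    """
    dists = [math.dist((x, y, z), injection_point)
             for (t, x, y, z) in events if t <= t_now]
    if not dists:
        return None
    return percentile(dists, 50), percentile(dists, 95)

# Toy catalog: three events at 3 m, 4 m, and 12 m from the injection point.
events = [(1, 3.0, 0.0, 0.0), (2, 0.0, 4.0, 0.0), (5, 0.0, 0.0, 12.0)]
p50, p95 = cloud_extent(events, (0.0, 0.0, 0.0), t_now=5)
```

In the toy catalog the single distant event pulls P95 far beyond P50, which is the behavior that makes the pair useful as separate measures of the active core and the far extent of the cloud.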

**Fig. 3: Microearthquake and injection history for EGS Collab Stimulation 3-5.**

Forecasting performance

We evaluate three forecast intervals (1 s, 15 s, and 30 s) using a sliding-window strategy. At each forecasting instant t_n, the model ingests the entire monitoring history [t₀, t_n] and predicts the subsequent interval [t_{n+1}, t_{n + l_future}], where l_future is the forecast range (e.g., 1 s, 15 s, or 30 s). For instance, when using a 15 s range, the model forecasts the next 15 s (e.g., t₁₀₁–t₁₁₅) based on the history data t₁–t₁₀₀. Once actual monitoring for these 15 s is recorded, these new data (t₁₀₁–t₁₁₅) are appended to the monitoring history. The model then uses the extended history t₁–t₁₁₅ to forecast the following segment t₁₁₆–t₁₃₀, and this procedure repeats until the monitoring concludes. Since the model consistently utilizes actual measurements without recycling previously predicted outputs, forecasting errors do not accumulate over successive forecasts (Fig. 4).
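The procedure above can be sketched as a short loop. The function names and the persistence placeholder are hypothetical and stand in for the trained transformer; the point of the sketch is only that each forecast is conditioned on observed samples, never on earlier predictions.

```python
def rolling_forecast(series, l_min, l_future, model):
    """Forecast consecutive blocks, always conditioning on observed data only."""
    forecasts = []
    split = l_min
    while split + l_future <= len(series):
        history = series[:split]                  # observed samples only
        forecasts.append(model(history, l_future))
        split += l_future                         # the newly observed block joins the history
    return forecasts

def persistence(history, l_future):
    """Placeholder model (an assumption): repeat the last observed value."""
    return [history[-1]] * l_future

series = list(range(1, 131))                      # stand-in for samples t1..t130
blocks = rolling_forecast(series, l_min=100, l_future=15, model=persistence)
```

With this toy series the second block is forecast from the observed value at t₁₁₅ rather than from the first block's prediction, so an error in one block does not feed into the next.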

**Fig. 4: Schematic of the forecasting procedure.**

Figure 5 compares the forecasted and observed cumulative MEQ counts. For the 1-s forecast model, the predicted curves are virtually indistinguishable from the ground truth, even on unseen data (validation R² = 0.999, test R² = 0.980). The 15-s forecast model maintains high fidelity (validation R² = 0.929, test R² = 0.972), with a slight tendency to overestimate MEQ growth during the most intense injection phases. The 30-s forecast model still captures the overall trend but systematically underpredicts the MEQ count late in each episode (validation R² = 0.649, test R² = 0.809). These results show that the transformer delivers excellent short-term forecasts, with accuracy declining gradually as the forecast window lengthens.

**Fig. 5: Cumulative MEQ counts: observed data (black dotted) versus forecasts for the 1-s (blue), 15-s (red), and 30-s (green) models.**

Second, we forecast the cumulative logarithmic seismic moment, a proxy for the activated reservoir volume and thus a key metric for planning new production wells^52,53. The cumulative moment $\mathcal{M}$ is defined as³⁹

$${{{\mathcal{M}}}}({t}_{i})=\int_{{t}_{0}}^{{t}_{i}}\log {M}_{0}\,dt,$$

(1)

with

$$\log {M}_{0}=1.5\,{M}_{w}+9.1,$$

(2)

where M₀ is the seismic moment, M_w the moment magnitude, t₀ the start of injection, and t_i the current injection time.
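A discrete version of Eqs. (1) and (2) on a regularly sampled time grid might look as follows. This is a sketch under stated assumptions: the integral is approximated by a running sum of log M₀ over the events recorded up to each time step, the function names are hypothetical, and the additive constant of Eq. (2) is passed in as `offset` rather than hard-coded (the value 10.0 below is a placeholder for illustration only).

```python
def log_moment(mw, offset):
    """Eq. (2): log M0 = 1.5 * Mw + offset, with offset the additive constant of Eq. (2)."""
    return 1.5 * mw + offset

def cumulative_log_moment(event_steps, magnitudes, n_steps, offset):
    """Running sum of log M0 over events, one value per time step (assumed discretization)."""
    per_step = [0.0] * n_steps
    for step, mw in zip(event_steps, magnitudes):
        per_step[step] += log_moment(mw, offset)
    total, curve = 0.0, []
    for value in per_step:
        total += value
        curve.append(total)
    return curve

# Two events (Mw = -2 at step 1, Mw = -1 at step 3) on a 4-step grid.
# offset=10.0 is a placeholder; use the constant from Eq. (2) in practice.
curve = cumulative_log_moment([1, 3], [-2.0, -1.0], n_steps=4, offset=10.0)
```

The resulting curve is non-decreasing by construction whenever log M₀ is positive, which is the property the monotonicity penalty in the Methods is meant to preserve in the forecasts.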

Figure 6 compares the predicted and observed cumulative moments for the 1-, 15-, and 30-s forecast models across the three data splits. The 1-s forecast model reproduces the observations almost exactly (validation R² = 0.999, test R² = 0.978). Performance remains high for the 15-s forecast model (validation R² = 0.878, test R² = 0.935), although the predictive bands widen compared with the 1-s case. The 30-s forecast model still captures the overall trend but underestimates the released seismic energy (validation R² = 0.546, test R² = 0.765). These results confirm that our neural network effectively links hydraulic-energy input to seismic-energy release, providing reliable short-term estimates of cumulative moment while showing a gradual and interpretable loss of accuracy as the forecast range increases.

**Fig. 6: Cumulative logarithmic seismic moment: observed data (black dotted) versus forecasts from the 1-, 15-, and 30-s models (blue, red, green) for the training (left), validation (middle), and test (right) sets.**

Accurately forecasting the spatial evolution of MEQ clouds is critical for delineating the affected area, guiding mitigation, and optimizing future well placement¹⁵. Figure 7 compares the spatial extent of the MEQ clouds across the training, validation, and test sets, quantified by the 50th and 95th percentiles of the Euclidean distance from the injection point. The 1-s and 15-s forecast models reproduce the ground truth trajectories of both the median distance (P₅₀) and the far distance (P₉₅), achieving R² > 0.97 for the 1-s forecast model and R² > 0.94 for the 15-s forecast model.

**Fig. 7: Temporal evolution of the MEQ cloud’s spatial extent.**

Figure 8 illustrates the final stabilized extents predicted by these models: absolute errors are below 1 m for the 1-s model and below 2 m for the 15-s model (Table 1). For the 1-s case, the observed-predicted differences lie within the model’s ±σ band, indicating that the discrepancies are consistent with the reported uncertainty. In contrast, the 15-s differences exceed σ, revealing the limitations of the mid-range model. The 30-s model underestimates both P₅₀ and P₉₅ in all data splits, highlighting its reduced reliability for long-range spatial forecasts.

**Fig. 8: Spatial evolution of microearthquake (MEQ) clouds and forecast performance.**

Table 1 Final MEQ spatial extent

Discussion and conclusion

Our transformer model accurately forecasts fluid-induced MEQs, capturing both their temporal evolution and spatial growth (Table 2). This dual capability is, to the best of our knowledge, novel; earlier studies focused mainly on temporal predictions^39,54. Reliable spatiotemporal forecasts are essential for estimating permeability changes and mitigating the risks associated with induced seismicity. In the following, we discuss how permeability enhancement can be inferred from monitoring data and model outputs, how fracture characteristics can be estimated, and the potential and limitations of deep-learning-based forecasting for field-scale, fluid-induced earthquakes.

Table 2 R² scores for all metrics and forecast models

Estimation of permeability enhancement

Estimating permeability enhancement is a critical task in EGS, yet direct measurements are challenging in the subsurface. This limitation also applies to our study—we aim to understand how permeability evolves during hydraulic stimulation, but no direct measurements were available from the field experiment. Although the correlation between MEQs and permeability remains elusive⁵⁵, we derive a physically grounded rationale to indirectly estimate permeability using model outputs. Specifically, we apply the cubic law for permeability, which relates changes in fracture aperture to permeability change^56,57:

$$\Delta k=\frac{{\left({b}_{0}+\Delta b\right)}^{3}}{12s}-\frac{{b}_{0}^{3}}{12s}$$

(3)

where Δk is the permeability change, b₀ is the initial fracture aperture, Δb is the aperture change, and s is the spacing between parallel fractures. Assuming that the initial aperture b₀ is negligible compared to the aperture change (i.e., b₀ ≪ Δb), we approximate the permeability evolution as $\Delta k\approx \frac{\Delta {b}^{3}}{12s}$. Given that the EGS Collab Experiment 1 aimed to establish fracture networks via hydraulic fracturing (i.e., tensile fractures)^55,58, we assume the seismic moment is linked to normal displacement by tensile opening. The equivalent moment M₀ for a tensile opening can be expressed as³⁹:

$${M}_{0}=2GA\Delta {u}_{n}$$

(4)

where G is the shear modulus, A is the area of the fracture, and Δu_n is the normal displacement across the fracture. Assuming the area A of the fracture is proportional to the aperture (A ∝ Δb)⁵⁹, we establish a direct proportionality between seismic moment M₀ and permeability change as⁶⁰:

$$\log {M}_{0}\propto \frac{2}{3}\log \Delta k$$

(5)

With these scaling relationships, we infer that the overall logarithmic permeability increment is linearly proportional to the logarithmic seismic moment, though this assumption primarily holds during early stimulation, where the initial aperture is substantially smaller than the aperture increment (i.e., b₀ ≪ Δb).
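The route from Eqs. (3) and (4) to Eq. (5) can be written out explicitly. The sketch below relies on assumptions beyond those stated in the text: the normal displacement is identified with the aperture change (Δu_n ≈ Δb), G and s are treated as constants, and c_A is a hypothetical proportionality constant for A ∝ Δb.

$$M_0 = 2GA\,\Delta u_n \approx 2G\,(c_A\,\Delta b)\,\Delta b = 2G c_A\,\Delta b^{2} \;\Rightarrow\; \Delta b \propto M_0^{1/2}, \qquad \Delta k \approx \frac{\Delta b^{3}}{12s} \propto M_0^{3/2} \;\Rightarrow\; \log M_0 = \frac{2}{3}\log \Delta k + \text{const}.$$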

During the first stimulation, the observed cumulative logarithmic seismic moment reaches ≈3 (Fig. 6 left), implying a permeability increase of roughly two orders of magnitude. The 1-s forecast reproduces this estimate, whereas the 15-s forecast model overpredicts the moment by about one order of magnitude, and the 30-s forecast model underpredicts it by a similar amount. Because the cumulative seismic moment predicted by our network can be mapped directly to permeability changes, the model provides a practical, indirect means of tracking permeability evolution during hydraulic stimulation, though this mapping is valid only for the initial seismic-moment range where the derivation’s assumptions hold.

Inference of the fracture characteristics

In fluid injection operations, we need to control the spatial extent of fracturing. As an example, in EGS fields, it is crucial to prevent MEQs from extending beyond the region between injection and production wells while enhancing permeability within this region through fracturing. Our model provides estimates of two spatial extents of MEQs: the 95th percentile distance (P₉₅) and the 50th percentile distance (P₅₀). P₉₅ represents the far extent of MEQs, while P₅₀ indicates the most active MEQ regions, which likely correspond to areas of greatest permeability increase due to fracture generation and re-opening.

The importance of tracking P₉₅ and P₅₀ becomes clear when the spatial extents from each stimulation are compared (Table 1). From Stimulation 3 (training) to Stimulation 4 (validation), the observed P₉₅ grows by 3.85 m (from 10.23 to 14.08 m), while P₅₀ retreats by 0.21 m (from 5.92 to 5.71 m), indicating a slight shrinkage of the seismically active zone. Our 1-s forecast model reproduces these shifts closely, predicting a 4.27 m increase in P₉₅ (from 10.74 to 15.01 m) and a 0.13 m retreat in P₅₀ (from 6.29 to 6.16 m); all absolute errors fall within the 1-s forecast model’s ±σ band. Between Stimulation 4 (validation) and Stimulation 5 (test), the observed P₉₅ increases by 1.15 m (from 14.08 to 15.23 m), whereas P₅₀ advances by 4.21 m (from 5.71 to 9.92 m). The 1-s forecast model again captures these trends, predicting a 0.94 m rise in P₉₅ (from 15.01 to 15.95 m) and a 4.09 m increase in P₅₀ (from 6.16 to 10.25 m). By accurately forecasting P₅₀ and P₉₅ in real time, the network enables practitioners to infer fracture propagation and activation, making it a practical tool for managing stimulation where direct measurements are not feasible.

Potential and challenges of deep learning forecasting

Among the various deep learning approaches, we chose the transformer model as our core architecture. The success of the transformer model is driven by several key factors. First, the self-attention mechanism allows the model to capture long-term dependencies^42,61,62, which are crucial in fluid-induced seismicity, where MEQs are influenced by cumulative fluid injection, pore pressure changes, and perturbed in-situ stress conditions². In particular, fluid-induced seismicity often exhibits long time intervals between injection and seismicity. For instance, the largest earthquake (local magnitude 3.9) at the deep geothermal site GEOVEN in Vendenheim occurred more than six months after shut-in⁶³. The self-attention mechanism enables the model to weigh the importance of different input features over time, making it highly suited for sequential data⁴⁴.

Second, transformers excel at processing spatiotemporal data⁶⁴, which is vital for accurately predicting the spatial distribution of MEQs. This ability provides critical insights into fracture propagation⁶⁵ and fluid migration⁶⁶, both of which are key factors in assessing the effectiveness of hydraulic stimulation. The model’s performance in predicting the spatial extent of seismic events reflects its capacity to capture both the temporal and spatial dynamics of fluid injection-induced microseismicity. Third, the transformer’s non-recurrent architecture allows it to handle irregular time series data⁶⁷, a common occurrence in microseismic monitoring due to variable injection schedules and operational pauses. This flexibility enhances the model’s robustness across different stimulation phases and geological settings, making it adaptable to varying conditions and data availability—a common challenge in real-world geophysical applications.

While the model shows promising results, extending it to large-scale field operations introduces additional uncertainties due to unknown geological heterogeneity and the extended temporal dependencies inherent to fluid-induced seismicity. The data used in this study were collected from an intermediate-scale (10–20 m) experiment with comprehensive monitoring tools from the EGS Collab project^47,50. Such dense instrumentation may not be feasible in reservoir-scale engineering applications, raising questions about the model’s generalizability to less controlled, large-scale environments. One promising strategy for adapting deep learning forecasting techniques to larger-scale fluid-induced seismicity applications involves transfer learning with fine-tuning. For example, successful transferability between datasets from Utah FORGE and EGS Collab was recently demonstrated using appropriate fine-tuning methods³⁹. Although further fine-tuning will likely be required to adjust the model to larger operational scales, the fundamental assumption remains that the neural network model learns generalizable signal patterns associated with fluid-induced MEQs. Additionally, integrating uncertainty quantification into predictions becomes increasingly important given the higher uncertainty inherent in real-field-scale operations. By incorporating these strategies, along with judicious monitoring, transformer networks could be systematically validated and effectively implemented at larger scales. Future work could involve training and validating the model’s performance with field-scale fluid-induced seismic data and hydraulic stimulation histories, thus ensuring robustness in more complex geological settings.

In summary, despite limitations related to monitoring systems and scale, this study presents a deep-learning-based approach for forecasting MEQs in response to fluid injection. The transformer model’s ability to predict both temporal and spatial evolution highlights its potential as a valuable tool in subsurface operations, offering substantial improvements in safety and efficiency.

Method: transformer neural network architecture

We employ a transformer neural network to forecast the spatiotemporal evolution of fluid-induced microearthquakes (MEQs). The attention mechanism captures dependencies in the monitoring time series, allowing the model to learn patterns across multiple temporal scales. Figure 1 illustrates the overall architecture. Given a sequence of past monitoring data, the model predicts the future MEQ features. The following subsections describe data processing, network architecture, loss function, and hyperparameter tuning.

Data preprocessing: crop and normalization

We first construct training segments by sliding a growing stimulation history across the cumulative time series and advancing the forecast horizon in non-overlapping blocks. The monitoring data at discrete time index t are defined as:

$${{{\boldsymbol{x}}}}(t)={\left[{x}_{1}(t),{x}_{2}(t),...,{x}_{6}(t)\right]}^{T}\in {{\mathbb{R}}}^{M}\,,$$

(6)

where the monitoring dimension M = 6 comprises two hydraulic stimulation features, (1) flow rate (x₁) and (2) wellhead pressure (x₂), and four spatiotemporal MEQ features: (3) cumulative MEQ count (x₃), (4) $\log {M}_{0}$ (x₄), (5) 95th-percentile distance (x₅), and (6) 50th-percentile distance (x₆).

The cropping procedure is controlled by two hyperparameters. The minimum history length ${l}_{\min }$ specifies the number of monitoring samples always available, and the forecast horizon l_future specifies how many future steps are predicted at once. For a monitoring record ending at t_end, the number of segments is

$$N=\frac{{t}_{end}-{l}_{\min }}{{l}_{{{{\rm{future}}}}}}$$

(7)

For each segment index k ∈ {0, . . . , N − 1} the split time is set as

$${t}_{k}^{\,{\mbox{split}}}={l}_{\min }+k{l}_{{{{\rm{future}}}}}$$

(8)

Thus, the cumulative monitoring input (X^(k)) and the subsequent forecast window (Y^(k)) are defined as:

$${{{{\boldsymbol{X}}}}}^{(k)}=\{{{{\boldsymbol{x}}}}(t)\,| \,1\le t\le {t}_{k}^{\,{\mbox{split}}\,}\}\in {{\mathbb{R}}}^{{t}_{k}^{{{{\rm{split}}}}}\times M}$$

(9)

$${\boldsymbol{Y}}^{(k)}=\{{\boldsymbol{x}}(t)\,|\,{t}_{k}^{\rm{split}} < t\le {t}_{k}^{\rm{split}}+{l}_{\rm{future}}\}\in {\mathbb{R}}^{{l}_{\rm{future}}\times F}$$

(10)

where F = 4 corresponds to the forecast MEQ features: (1) cumulative MEQ count, (2) $\log {M}_{0}$, (3) P₉₅, and (4) P₅₀. Each successive segment index k advances the split by l_future, ensuring that the predicted time blocks Y^(k) are non-overlapping and contiguous, while the input window grows monotonically. This approach yields continuous, leakage-free forecasting segments that can be applied in real time once at least ${l}_{\min }$ monitoring samples have been acquired (Fig. 4).
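The cropping in Eqs. (7) to (10) can be sketched as below. This is an illustrative reading, not the authors' code: the function name, the row-per-time-step layout, the column order of the targets, and rounding Eq. (7) down to an integer are assumptions.

```python
def make_segments(x, l_min, l_future, target_cols=(2, 3, 4, 5)):
    """Build (X, Y) pairs following Eqs. (7)-(10).

    x: list of per-time-step feature rows (t_end rows, M columns).
    target_cols: indices of the F forecast features (assumed to be x3..x6).
    """
    t_end = len(x)
    n_segments = (t_end - l_min) // l_future          # Eq. (7), rounded down (assumption)
    segments = []
    for k in range(n_segments):
        t_split = l_min + k * l_future                # Eq. (8)
        X = x[:t_split]                               # Eq. (9): samples 1..t_split
        Y = [[row[c] for c in target_cols]
             for row in x[t_split:t_split + l_future]]  # Eq. (10): next l_future samples
        segments.append((X, Y))
    return segments

# Toy record of 130 time steps with 6 features, each equal to the 1-based time index.
x = [[float(t)] * 6 for t in range(1, 131)]
segs = make_segments(x, l_min=100, l_future=15)
```

Here the first pair uses samples 1 to 100 to target samples 101 to 115, and the second uses samples 1 to 115 to target samples 116 to 130, matching the growing-history scheme of Fig. 4.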

To fairly normalize the data without information leakage from future steps, normalization is applied individually to each input window X^(k). For each monitoring dimension m ∈ {1, . . . , M} and each segment k, we define the normalization using only the known input window as follows:

$${\tilde{{{{\boldsymbol{x}}}}}}_{m}^{(k)}=\frac{{{{{\boldsymbol{x}}}}}_{m}^{(k)}-{\min }_{1\le t\le {t}_{k}^{\,{{\rm{split}}}}}{x}_{m}(t)}{{\max }_{1\le t\le {{t}_{k}^{{{{\rm{split}}}}}}}{x}_{m}(t)-{\min }_{1\le t\le {{t}_{k}^{{{\rm{split}}}}\,}}{x}_{m}(t)},\quad 1\le t\le {{t}_{k}^{\,{{\rm{split}}}}\,}$$

(11)

The normalization parameters obtained from each input window X^(k) are then consistently applied to scale the corresponding forecast window Y^(k). This ensures that normalization relies exclusively on information available at the prediction time, thus avoiding any data leakage from future observations.
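A minimal sketch of this per-window min-max scaling follows. The helper names and the small `eps` guard for features that are constant within a window are assumptions added for the example; for simplicity the toy forecast window carries the same columns as the input window.

```python
def minmax_from_window(X):
    """Per-feature (min, max) computed from the input window only (Eq. 11)."""
    return [(min(col), max(col)) for col in zip(*X)]

def scale(rows, stats, eps=1e-12):
    """Apply (v - min) / (max - min) column-wise; eps guards constant columns (assumption)."""
    return [[(v - lo) / max(hi - lo, eps) for v, (lo, hi) in zip(row, stats)]
            for row in rows]

def unscale(rows, stats, eps=1e-12):
    """Inverse of scale, used to map predictions back to physical units."""
    return [[v * max(hi - lo, eps) + lo for v, (lo, hi) in zip(row, stats)]
            for row in rows]

X = [[0.0, 10.0], [2.0, 10.0], [4.0, 30.0]]   # input window with two features
Y = [[6.0, 40.0]]                             # forecast window, same two features
stats = minmax_from_window(X)                 # uses X only, never Y
X_n = scale(X, stats)
Y_n = scale(Y, stats)
```

Because the statistics come from the input window alone, scaled targets for growing cumulative quantities can exceed 1, as `Y_n` does here; the network output is not confined to the unit interval under this scheme.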

Neural network architecture

Our transformer neural network architecture employs a multi-head attention mechanism designed to effectively capture temporal dependencies from variable-length sequences. Given an input monitoring sequence X^(k), the multi-head attention layer processes the input as follows⁴²:

$$\,{\mbox{Attention}}\,({{{\bf{Q}}}},{{{\bf{K}}}},{{{\bf{V}}}})=\,{\mbox{softmax}}\,\left(\frac{{{{\bf{Q}}}}{{{{\bf{K}}}}}^{\top }}{\sqrt{{d}_{k}}}\right){{{\bf{V}}}}\,,$$

(12)

where Q = X^(k)W_Q, K = X^(k)W_K, and V = X^(k)W_V are the query, key, and value matrices, respectively; W_Q, W_K, and W_V are learnable weight matrices; d_k is the dimension of key vectors.
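Equation (12) can be illustrated with a single attention head written in plain Python. This is a didactic sketch only: the model described here uses multi-head attention inside a deep-learning framework, and the helper names and toy matrices below are assumptions.

```python
import math

def matmul(a, b):
    """Matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax of one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product attention, Eq. (12)."""
    Q, K, V = matmul(X, W_q), matmul(X, W_k), matmul(X, W_v)
    d_k = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(Q, K_T)]
    weights = [softmax(row) for row in scores]        # one weight row per query time step
    return matmul(weights, V), weights

X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]              # 3 time steps, 2 features
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
out, weights = attention(X, zeros, identity, identity)
```

With a zero query projection all scores are equal, so every time step attends uniformly and each output row is the mean of the value rows; learned projections replace this uniform weighting with data-dependent weights.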

Following the attention layer, a feed-forward network (FFN)⁶⁸ is applied independently to each time step. The FFN consists of two linear transformations with a Rectified Linear Unit (ReLU) activation function:

$$\,{\mbox{FFN}}\,({{{\bf{z}}}})=\,{\mbox{ReLU}}\,({{{\bf{z}}}}{{{{\bf{W}}}}}_{1}+{{{{\bf{b}}}}}_{1}){{{{\bf{W}}}}}_{2}+{{{{\bf{b}}}}}_{2}\,,$$

(13)

where z denotes the input from the attention output, and W₁, W₂, b₁, and b₂ are learnable parameters.

To enhance training stability, layer normalization and residual connections are applied after both attention and feed-forward layers. These ensure effective gradient propagation and prevent training instabilities.

After attention and feed-forward layers, global average pooling and dense layers reduce the sequence to a single vector, producing predictions for the forecasting window Y^(k). In particular, the model predicts both the mean (μ) and log-variance ($\log {\sigma }^{2}$) of these forecasting MEQ features to quantify prediction uncertainty:

$${{{{\boldsymbol{y}}}}}_{{{{\rm{pred}}}}}\in {{\mathbb{R}}}^{{l}_{{{{\rm{future}}}}}\times 2F},\quad {{{{\boldsymbol{y}}}}}_{{{{\rm{pred}}}}}(t)=[{\mu }_{1}(t),\,...,\,{\mu }_{F}(t),\,\log {\sigma }_{1}^{2}(t),\,...,\,\log {\sigma }_{F}^{2}(t)]$$

(14)
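One row of the output in Eq. (14) can be unpacked into means and standard deviations as sketched below. The function name is hypothetical; the only relation used is σ = exp(½ log σ²).

```python
import math

def split_prediction(y_row, n_features=4):
    """Split one time step of Eq. (14) into means and standard deviations."""
    mu = y_row[:n_features]
    log_var = y_row[n_features:2 * n_features]
    sigma = [math.exp(0.5 * lv) for lv in log_var]    # sigma = sqrt(exp(log sigma^2))
    return mu, sigma

row = [0.2, 0.4, 0.6, 0.8, 0.0, math.log(4.0), math.log(0.25), 0.0]
mu, sigma = split_prediction(row)
```

Predicting the log-variance rather than the variance keeps σ positive for any real-valued network output, which is a common reason for this parameterization.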

The model is trained using the Adam optimizer⁶⁹ with a heteroscedastic Gaussian negative log-likelihood (NLL) loss function^70,71, augmented by a monotonicity penalty weighted by the hyperparameter (λ):

$${{{\mathcal{L}}}}=\,{\mbox{NLL}}\,({{{{\bf{y}}}}}_{{{{\rm{true}}}}},{{{{\bf{y}}}}}_{{{{\rm{pred}}}}})+\lambda \,{{\mbox{Penalty}}}_{{{{\rm{mono}}}}}$$

(15)

The NLL explicitly measures the discrepancy between predictions and true values, accounting for predictive uncertainty. Given the predicted mean (μ_pred) and log-variance ($\log {\sigma }_{\,{\mbox{pred}}\,}^{2}$), the NLL is defined as:

$$\,{\mbox{NLL}}\,({{{{\bf{y}}}}}_{{{{\rm{true}}}}},{{{{\bf{y}}}}}_{{{{\rm{pred}}}}})=\frac{1}{2NF}\mathop{\sum }_{i = 1}^{N}\mathop{\sum }_{f = 1}^{F}\left[\frac{{({y}_{i,f}^{{{{\rm{true}}}}}-{\mu }_{i,f}^{{{{\rm{pred}}}}})}^{2}}{{\sigma }_{i,f}^{2}}+\alpha \log \left({\sigma }_{i,f}^{2}\right)\right],$$

(16)

where N is the number of time steps in the forecast window, F is the number of MEQ target features, and α is a hyperparameter that weights the log-variance term and thereby discourages the model from inflating the variance. This formulation captures both prediction accuracy and model confidence, penalizing over- or under-confident forecasts.
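Eq. (16) translates directly into a few lines of NumPy. The function name and the toy arrays are assumptions for illustration.

```python
import numpy as np

def gaussian_nll(y_true, mu, log_var, alpha=1.0):
    """Heteroscedastic Gaussian NLL of Eq. (16), averaged over N x F entries."""
    var = np.exp(log_var)
    per_element = (y_true - mu) ** 2 / var + alpha * log_var
    return 0.5 * per_element.mean()          # mean over N*F gives the 1/(2NF) factor

y_true = np.full((3, 4), 2.0)                # toy targets: N = 3 steps, F = 4 features
mu = np.zeros((3, 4))                        # every prediction is off by 2
overconfident = gaussian_nll(y_true, mu, np.zeros((3, 4)))           # sigma^2 = 1
calibrated = gaussian_nll(y_true, mu, np.full((3, 4), np.log(4.0)))  # sigma^2 = 4
```

In this toy case the loss is lower when the predicted variance matches the squared error (σ² = 4) than when the model claims σ² = 1. This is how the loss rewards calibrated uncertainty. Increasing `alpha` makes large variances more expensive.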

To encourage non-decreasing forecasts of the cumulative quantities, a monotonicity penalty is applied to the cumulative MEQ count and the cumulative logarithmic seismic moment. The penalty is defined as:

$${{\mbox{Penalty}}}_{{{{\rm{mono}}}}}=\mathop{\sum }_{t = 2}^{T}\left| \min \left(0,\,{{{{\bf{y}}}}}_{\,{\mbox{pred}}\,}^{t}-{{{{\bf{y}}}}}_{\,{\mbox{pred}}\,}^{t-1}\right)\right| \,,$$

(17)

where only the selected cumulative features are included in the penalty term.
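The penalty of Eq. (17) can be sketched as below. The column ordering (cumulative features in columns 0 and 1) and the summation over features as well as time are assumptions made for this example.

```python
import numpy as np

def monotonicity_penalty(mu, cumulative_idx):
    """Sum of the magnitudes of all step-to-step decreases in the chosen columns."""
    diffs = np.diff(mu[:, cumulative_idx], axis=0)   # y^t - y^(t-1) for t = 2..T
    return np.abs(np.minimum(0.0, diffs)).sum()      # only negative steps contribute

# Toy forecast: T = 3 steps; columns 0-1 are cumulative, column 2 is not penalised
mu = np.array([[1.0, 5.0, 0.3],
               [2.0, 4.0, 0.1],
               [1.5, 6.0, 0.2]])
penalty = monotonicity_penalty(mu, cumulative_idx=[0, 1])
```

Here column 0 drops by 0.5 at the last step and column 1 drops by 1.0 at the second step, so the penalty is 1.5. The decrease in column 2 is ignored because that feature is not cumulative.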

Finally, all predictions are rescaled using the inverse of the normalization applied during preprocessing. The model performance is evaluated using the coefficient of determination (R²):

$${R}^{2}=1-\frac{\mathop{\sum }_{i = 1}^{n}{\left({Y}_{i}-{\hat{Y}}_{i}\right)}^{2}}{\mathop{\sum }_{i = 1}^{n}{\left({Y}_{i}-\bar{Y}\right)}^{2}}$$

(18)

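where Y_i are the observed values, Ŷ_i are the corresponding predictions, Ȳ is the mean of the observations, and n is the number of samples; Y includes the four spatiotemporal MEQ features.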
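A minimal implementation of Eq. (18), with toy numbers that are assumptions for illustration only:

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination, Eq. (18)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
score = r2_score(y_true, y_pred)   # ss_res = 0.1, ss_tot = 5.0, so R^2 = 0.98
```

A score of 1 indicates a perfect forecast, 0 matches a forecast that always predicts the mean of the observations, and negative values indicate a forecast worse than that baseline.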

Neural-network hyper-parameter tuning

The transformer model is trained to forecast spatiotemporal MEQs from hydraulic-stimulation history and past MEQ responses. While network weights are learned automatically, several settings (loss-function coefficients, architectural widths, batch size, dropout rate, and penalty weights) must be chosen by the user⁷²,⁷³. Supplementary Table 1 lists the values that remain fixed in every experiment.

Two coefficients are tuned by grid search: β, the variance-regularization weight inside the heteroscedastic Gaussian NLL term (written as α in Eq. (16)), and λ, the weight on the monotonicity penalty applied to the cumulative MEQ count and the cumulative logarithmic seismic moment. For each forecast horizon l_future ∈ {1, 15, 30}, models are trained with β, λ ∈ {0.1, 1.0, 10.0}. Validation R² scores identify the optimal pair (β^⋆, λ^⋆); the corresponding results appear in Supplementary Table 2.
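The search loop can be sketched as follows. The function `train_and_validate` is a hypothetical callback standing in for training a model and returning its validation R²; `dummy_score` is a made-up placeholder so the sketch runs, and its values carry no meaning.

```python
from itertools import product

def grid_search(train_and_validate, horizons=(1, 15, 30), grid=(0.1, 1.0, 10.0)):
    """Return {l_future: ((beta, lam), best validation R^2)} over a 3 x 3 grid."""
    best = {}
    for l_future in horizons:
        scores = {(beta, lam): train_and_validate(l_future, beta, lam)
                  for beta, lam in product(grid, grid)}   # 9 pairs per horizon
        best_pair = max(scores, key=scores.get)
        best[l_future] = (best_pair, scores[best_pair])
    return best

def dummy_score(l_future, beta, lam):
    """Placeholder score; a real run would train the transformer here."""
    return 1.0 - 0.01 * l_future - 0.1 * abs(beta - 1.0) - 0.1 * abs(lam - 0.1)

best = grid_search(dummy_score)
```

With three horizons and nine (β, λ) pairs each, the search trains 27 models in total, and the best pair is selected separately for each horizon.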

Short-horizon models, with forecast windows of up to fifteen seconds, achieve excellent accuracy; for example, the l_future = 15 model reaches ${R}_{\,{\mbox{val}}\,}^{2}=0.924$. As the horizon lengthens, performance degrades: at l_future = 30 the best model attains ${R}_{\,{\mbox{val}}\,}^{2}=0.046$. The horizon-specific models reported in Supplementary Table 2 are used for all subsequent experiments.

Data availability

The EGS Collab experiment’s stimulation data and seismic catalog are available at https://doi.org/10.15121/1651116 and https://doi.org/10.15121/1557417.

Code availability

The code used in this study is available on GitHub at https://github.com/jh-chung1/Transformer_MEQ_Forecasting.

References

Metz, B., Davidson, O., De Coninck, H., Loos, M. & Meyer, L. IPCC Special Report on Carbon Dioxide Capture and Storage (Cambridge University Press, 2005).
Williams, C. F., Reed, M. J., Mariner, R. H., DeAngelo, J. & Galanis, S. P. Assessment of Moderate-and High-Temperature Geothermal Resources of the United States. Technical Report (Geological Survey (US), 2008).
Bachu, S. & Adams, J. J. Sequestration of CO₂ in geological media in response to climate change: capacity of deep saline aquifers to sequester CO₂ in solution. Energy Convers. Manag. 44, 3151–3175 (2003).
Damen, K., Faaij, A. & Turkenburg, W. Health, safety and environmental risks of underground CO₂ storage–overview of mechanisms and current knowledge. Clim. Change 74, 289–318 (2006).
Rutqvist, J. The geomechanics of CO₂ storage in deep sedimentary formations. Geotech. Geol. Eng. 30, 525–551 (2012).
Yeo, I., Brown, M. R., Ge, S. & Lee, K. Causal mechanism of injection-induced earthquakes through the M_w 5.5 Pohang earthquake case study. Nat. Commun. 11, 2614 (2020).
Ellsworth, W. L., Giardini, D., Townend, J., Ge, S. & Shimamoto, T. Triggering of the Pohang, Korea, earthquake M_w 5.5 by enhanced geothermal system stimulation. Seismol. Res. Lett. 90, 1844–1858 (2019).
Wang, C.-Y., Manga, M., Shirzaei, M., Weingarten, M. & Wang, L.-P. Induced seismicity in Oklahoma affects shallow groundwater. Seismol. Res. Lett. 88, 956–962 (2017).
Rajesh, R. & Gupta, H. K. Characterization of injection-induced seismicity at north central Oklahoma, USA. J. Seismol. 25, 327–337 (2021).
Johann, L., Shapiro, S. A. & Dinske, C. The surge of earthquakes in Central Oklahoma has features of reservoir-induced seismicity. Sci. Rep. 8, 11505 (2018).
Hincks, T., Aspinall, W., Cooke, R. & Gernon, T. Oklahoma’s induced seismicity strongly linked to wastewater injection depth. Science 359, 1251–1255 (2018).
Manga, M., Wang, C.-Y. & Shirzaei, M. Increased stream discharge after the 3 September 2016 M_w 5.8 Pawnee, Oklahoma earthquake. Geophys. Res. Lett. 43, 11–588 (2016).
Fiori, R., Vergne, J., Schmittbuhl, J. & Zigone, D. Monitoring induced microseismicity in an urban context using very small seismic arrays: the case study of the Vendenheim EGS project. Geophysics 88, WB71–WB87 (2023).
Shapiro, S. A. Fluid-Induced Seismicity (Cambridge University Press, 2015).
Boyet, A., Vilarrasa, V., Rutqvist, J. & De Simone, S. Forecasting fluid-injection induced seismicity to choose the best injection strategy for safety and efficiency. Philos. Trans. A 382, 20230179 (2024).
Lu, J. & Ghassemi, A. Coupled thermo–hydro–mechanical–seismic modeling of EGS collab experiment 1. Energies 14, 446 (2021).
McClure, M. W. & Horne, R. N. Investigation of injection-induced seismicity using a coupled fluid flow and rate/state friction model. Geophysics 76, WC181–WC198 (2011).
Zhai, G., Shirzaei, M., Manga, M. & Chen, X. Pore-pressure diffusion, enhanced by poroelastic stresses, controls induced seismicity in Oklahoma. Proc. Natl Acad. Sci. 116, 16228–16233 (2019).
Kumazawa, T. & Ogata, Y. Nonstationary ETAS models for nonstandard earthquakes. Ann. Appl. Stat. 8, 1825–1852 (2014).
Ritz, V. A. et al. Pseudo-prospective forecasting of induced and natural seismicity in the Hengill geothermal field. J. Geophys. Res. Solid Earth 129, e2023JB028402 (2024).
Hainzl, S. & Ogata, Y. Detecting fluid signals in seismicity data through statistical earthquake modeling. J. Geophys. Res. Solid Earth 110, B05S07 (2005).
Kumazawa, T. & Ogata, Y. Quantitative description of induced seismic activity before and after the 2011 Tohoku-Oki earthquake by nonstationary ETAS models. J. Geophys. Res. Solid Earth 118, 6165–6182 (2013).
Petrillo, G., Kumazawa, T., Napolitano, F., Capuano, P. & Zhuang, J. Fluids-triggered swarm sequence supported by a nonstationary epidemic-like description of seismicity. Seismol. Res. Lett. 95, 3207–3220 (2024).
Aochi, H., Maury, J. & Le Guenan, T. How do statistical parameters of induced seismicity correlate with fluid injection? Case of Oklahoma. Seismol. Soc. Am. 92, 2573–2590 (2021).
Qin, Y., Chen, T., Ma, X. & Chen, X. Forecasting induced seismicity in Oklahoma using machine learning methods. Sci. Rep. 12, 9319 (2022).
Zhang, W. et al. Application of machine learning, deep learning and optimization algorithms in geoengineering and geoscience: comprehensive review and future challenge. Gondwana Res. 109, 1–17 (2022).
Chung, J., Ahmad, R., Sun, W., Cai, W. & Mukerji, T. Prediction of effective elastic moduli of rocks using graph neural networks. Comput. Methods Appl. Mech. Eng. 421, 116780 (2024).
Camps-Valls, G., Tuia, D., Zhu, X. X. & Reichstein, M. Deep Learning for the Earth Sciences: A Comprehensive Approach to Remote Sensing, Climate Science and Geosciences (John Wiley & Sons, 2021).
Maniar, H., Ryali, S., Kulkarni, M. S. & Abubakar, A. Machine-learning methods in geoscience. In Proc. SEG International Exposition and Annual Meeting, SEG–2018 (SEG, 2018).
Bergen, K. J., Johnson, P. A., de Hoop, M. V. & Beroza, G. C. Machine learning for data-driven discovery in solid earth geoscience. Science 363, eaau0323 (2019).
Yu, S. & Ma, J. Deep learning for geophysics: current and future trends. Rev. Geophys. 59, e2021RG000742 (2021).
Mousavi, S. M. & Beroza, G. C. Deep-learning seismology. Science 377, eabm4470 (2022).
Zhu, W. & Beroza, G. C. PhaseNet: a deep-neural-network-based seismic arrival-time picking method. Geophys. J. Int. 216, 261–273 (2019).
Reichstein, M. et al. Deep learning and process understanding for data-driven earth system science. Nature 566, 195–204 (2019).
Anikiev, D. et al. Machine learning in microseismic monitoring. Earth-Sci. Rev. 239, 104371 (2023).
Jinqiang, W., Basnet, P. & Mahtab, S. Review of machine learning and deep learning application in mine microseismic event classification. Min. Miner. Depos. 15, 19–26 (2021).
Mousavi, S. M., Horton, S. P., Langston, C. A. & Samei, B. Seismic features and automatic discrimination of deep and shallow induced-microearthquakes using neural network and logistic regression. Geophys. J. Int. 207, 29–46 (2016).
Li, Z., Eaton, D. W. & Davidsen, J. Physics-informed deep learning to forecast $\hat{M}_{\max}$ during hydraulic fracturing. Sci. Rep. 13, 13133 (2023).
Yu, P. et al. Crustal permeability generated through microearthquakes is constrained by seismic moment. Nat. Commun. 15, 2057 (2024).
Mital, U., Hu, M., Guglielmi, Y., Brown, J. & Rutqvist, J. Modeling injection-induced fault slip using long short-term memory networks. J. Rock Mech. Geotech. Eng. 16, 4354–4368 (2024).
Hummel, N. & Shapiro, S. A. Nonlinear diffusion-based interpretation of induced microseismicity: A Barnett shale hydraulic fracturing case study. Geophysics 78, B211–B226 (2013).
Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017).
Zeng, A., Chen, M., Zhang, L. & Xu, Q. Are transformers effective for time series forecasting? Proc. AAAI Conf. Artif. Intell. 37, 11121–11128 (2023).
Zhou, H. et al. Informer: beyond efficient transformer for long sequence time-series forecasting. In Proc. AAAI Conference on Artificial Intelligence, vol. 35, 11106–11115 (2021).
Schoenball, M. et al. Creation of a mixed-mode fracture network at mesoscale through hydraulic fracturing and shear stimulation. J. Geophys. Res. Solid Earth 125, e2020JB019807 (2020).
Fu, P. et al. Close observation of hydraulic fracturing at EGS Collab Experiment 1: fracture trajectory, microseismic interpretations, and the role of natural fractures. J. Geophys. Res. Solid Earth 126, e2020JB020840 (2021).
Kneafsey, T. J. et al. An overview of the EGS Collab project: field validation of coupled process modeling of fracturing and fluid flow at the Sanford Underground Research Facility, Lead, SD. In Proc. 43rd Workshop on Geothermal Reservoir Engineering, vol. 2018 (Curran Associates, Inc., 2018).
Qin, Y. et al. Source mechanism of kHz microseismic events recorded in multiple boreholes at the first EGS Collab Testbed. Geothermics 120, 102994 (2024).
Feng, Z. et al. Monitoring spatiotemporal evolution of fractures during hydraulic stimulations at the first EGS collab testbed using anisotropic elastic-waveform inversion. Geothermics 122, 103076 (2024).
Kneafsey, T. J. et al. EGS Collab project: status and progress. In Proc. 44th Workshop on Geothermal Reservoir Engineering (Stanford University, 2019).
Kneafsey, T. J. et al. The EGS Collab project: learnings from experiment 1. In Proc. 45th Workshop on Geothermal Reservoir Engineering, 10–12 (Stanford University 2020).
Rothert, E. & Baisch, S. Passive seismic monitoring: mapping enhanced fracture permeability. In Proc. World Geothermal Congress 25–29 (European Association of Geoscientists & Engineers, 2010).
Baisch, S., Vörös, R., Weidler, R. & Wyborn, D. Investigation of fault mechanisms during geothermal reservoir stimulation experiments in the Cooper Basin, Australia. Bull. Seismol. Soc. Am. 99, 148–158 (2009).
Li, Z., Elsworth, D., Wang, C. & EGS-Collab. Induced microearthquakes predict permeability creation in the brittle crust. Front. Earth Sci. 10, 1020294 (2022).
Kneafsey, T. et al. The EGS Collab project: Outcomes and lessons learned from hydraulic fracture stimulations in crystalline rock at 1.25 and 1.5 km depth. Geothermics 126, 103178 (2025).
Witherspoon, P. A., Wang, J. S., Iwai, K. & Gale, J. E. Validity of cubic law for fluid flow in a deformable rock fracture. Water Resour. Res. 16, 1016–1024 (1980).
Ouyang, Z. & Elsworth, D. Evaluation of groundwater flow into mined panels. Int. J. Rock Mech. Min. Sci. Geomech. Abstracts, 30, 71–79 (1993).
Morris, J. P. et al. Experimental design for hydrofracturing and fluid flow at the DOE EGS collab testbed. In Proc. ARMA US Rock Mechanics/Geomechanics Symposium, ARMA–2018 (ARMA, 2018).
Olson, J. E. Sublinear scaling of fracture aperture versus length: an exception or the rule? J. Geophys. Res. Solid Earth 108, 2413 (2003).
Ishibashi, T., Watanabe, N., Asanuma, H. & Tsuchiya, N. Linking microearthquakes to fracture permeability change: the role of surface roughness. Geophys. Res. Lett. 43, 7486–7493 (2016).
Chen, Z., Ma, M., Li, T., Wang, H. & Li, C. Long sequence time-series forecasting with deep learning: a survey. Inf. Fusion 97, 101819 (2023).
Nie, Y., Nguyen, N. H., Sinthong, P. & Kalagnanam, J. A time series is worth 64 words: long-term forecasting with transformers. arXiv preprint. (2022).
Lengliné, O. et al. The largest induced earthquakes during the geoven deep geothermal project, Strasbourg, 2018–2022: from source parameters to intensity maps. Geophys. J. Int. 234, 2445–2457 (2023).
Giuliari, F., Hasan, I., Cristani, M. & Galasso, F. Transformer networks for trajectory forecasting. In Proc. 25th International Conference on Pattern Recognition (ICPR), 10335–10342 (IEEE, 2021).
Gischig, V. S. Rupture propagation behavior and the largest possible earthquake induced by fluid injection into deep reservoirs. Geophys. Res. Lett. 42, 7420–7428 (2015).
Bhattacharya, P. & Viesca, R. C. Fluid-induced aseismic fault slip outpaces pore-fluid migration. Science 364, 464–468 (2019).
Chen, Y. et al. Contiformer: continuous-time transformer for irregular time series modeling. Adv. Neural Inf. Process. Syst. 36 (2024).
Svozil, D., Kvasnicka, V. & Pospichal, J. Introduction to multi-layer feed-forward neural networks. Chemometr. Intell. Lab. Syst. 39, 43–62 (1997).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. arXiv preprint. https://doi.org/10.48550/arXiv.1412.6980 (2014).
Nix, D. A. & Weigend, A. S. Estimating the mean and variance of the target probability distribution. In Proc. IEEE International Conference on Neural Networks (ICNN’94), vol. 1, 55–60 (IEEE, 1994).
Lakshminarayanan, B., Pritzel, A. & Blundell, C. Simple and scalable predictive uncertainty estimation using deep ensembles. In Proc. Advances in Neural Information Processing Systems. Vol 30 (Curran Associates, Inc., 2017).
Feurer, M. & Hutter, F. Hyperparameter optimization. Automated Machine Learning: Methods, Systems, Challenges 3–33 (Springer, 2019).
Yang, L. & Shami, A. On hyperparameter optimization of machine learning algorithms: theory and practice. Neurocomputing 415, 295–316 (2020).

Acknowledgements

J.C. gratefully acknowledges the support of the Ingenuity: Next Generation Nuclear Waste Disposal Internship program, funded by the U.S. Department of Energy, Office of Nuclear Energy, and Office of Spent Fuel and Waste Disposition. This work was supported by the US Department of Energy (DOE), the Office of Nuclear Energy, Spent Fuel and Waste Science and Technology Campaign, and by the US Department of Energy (DOE), under Contract Number DE-AC02-05CH11231 with Lawrence Berkeley National Laboratory.

Author information

Authors and Affiliations

Department of Geophysics, Stanford University, Stanford, CA, USA
Jaehong Chung & Tapan Mukerji
Department of Earth and Planetary Science, University of California Berkeley, Berkeley, CA, USA
Michael Manga
Energy Geoscience Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Timothy Kneafsey & Mengsu Hu
Department of Energy Science and Engineering, Stanford University, Stanford, CA, USA
Tapan Mukerji


Contributions

J. C. Conceptualization, Methodology, Investigation, Visualization, Writing—original draft, Review and editing. M. M. Conceptualization, Investigation, Supervision, Review and editing. T. K. Conceptualization, Investigation, Supervision, Review and editing. T. M. Investigation, Supervision, Review and editing. M. H. Conceptualization, Investigation, Funding acquisition, Project administration, Supervision, Review and editing.

Corresponding author

Correspondence to Mengsu Hu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Earth and Environment thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editors: Marisol Monterrubio-Velasco and Joe Aslin. [A peer review file is available].

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Chung, J., Manga, M., Kneafsey, T. et al. Deep learning forecasts the spatiotemporal evolution of fluid-induced microearthquakes. Commun Earth Environ 6, 643 (2025). https://doi.org/10.1038/s43247-025-02644-z

Received: 10 December 2024
Accepted: 29 July 2025
Published: 07 August 2025
Version of record: 07 August 2025
DOI: https://doi.org/10.1038/s43247-025-02644-z
