Introduction

The ultimate goal of a regional earthquake early warning (EEW) system (e.g., ShakeAlert1 and TRUAA2) is to rapidly predict ground motion or seismic intensity across an entire region, enabling the mindful issuance of earthquake warnings. To achieve this, the most common approach is to evaluate the location and magnitude of an earthquake upon detection, from which regional ground motions are predicted1. To stay ahead of the most destructive seismic phases (secondary body waves and surface waves), a common strategy is to place sensing instruments near known fault systems3, detect and process the data starting from P-phase arrival, and leverage the fact that modern data transmission can be faster than seismic wave propagation4.

Distributed acoustic sensing (DAS) holds great promise for seismology in general5,6 and specifically for earthquake early warning7,8,9,10,11. One DAS interrogator can sense the strain or strain rate across an optic fiber with a length of approximately 100 kilometers with unprecedented temporal and spatial resolutions12. DAS allows for array-based processing, such as f-k analysis13,14,15,16, beamforming9,17,18,19, and backprojection20,21,22, which is hard to achieve at scale with conventional arrays. Once deployed, fibers require minimal maintenance and can withstand harsh environments, making DAS exceptionally promising for EEW, especially for hard-to-access areas.

Poor instrumentation coverage often leads to two acute challenges for EEW: erroneous epicenter locations23,24,25, and delayed detection and alert issuance7,8,26 for events that occur outside the seismic network or when recorded by sparse networks. A common worldwide example is the sparse to absent ocean-bottom instrumentation, which significantly lags behind continental coverage. Since many earthquakes nucleate below the seafloor, current spatially biased monitoring results in lower quality locations and consequently source property estimates, as well as delays in alert issuance. DAS has great potential in filling this observation gap.

Despite all of its merits, DAS has some EEW-related limitations and unique characteristics compared to conventional instrumentation. Those limitations can be categorized into those caused by inherent DAS operational principles and those that arise from the relatively short history of DAS recordings compared to conventional seismic data.

DAS measures unidirectional strain (or strain rate) along the fiber’s longitudinal axis. This imposes directional sensitivity that can hinder the accurate measurement of seismic phases that are not polarized parallel to the fiber’s longitudinal axis27, with the extreme case of a signal below noise levels when the wave is polarized perpendicular to the fiber’s axis. This differs from common three-component point sensors (i.e., seismometers, accelerometers, GNSS), which measure the full ground motion field, albeit with limited spatial resolution. For example, P-phase detection on a vertical component of seismometers, a routine task in seismology, becomes challenging with horizontal fibers. This is especially evident in ocean bottom deployments where the abrupt velocity contrast between the bedrock and the ocean bottom sediments produces near-vertical incidence angles28, and scattered waves dominate the DAS signal29,30. Another limitation is the dynamic range that may saturate for strong strain rate31,32. This yet to be fully resolved subject is expected to be significant for short hypocentral distances and large earthquakes, and is thus crucial for the proper functioning of DAS-based EEW. Saturation can be minimized by reducing the gauge length32 and optimizing the fiber location relative to expected sources.

Since regional EEW systems rely on seismic data from a network of many single stations, and since the amount of data acquired by each station is small enough, raw sensor data is usually transmitted in real-time to a central computation center1. The high resolution of DAS, both in space and time, yields immense amounts of data, necessitating processing the data at the site of measurement. DAS interrogator deployments should be viewed as a mini data and computation center that aggregates data from thousands of virtual seismic stations, located up to 100 kilometers from the interrogator. Data transfer latencies are nearly zero, as back-scattered light travels at the speed of light in the fiber. Furthermore, many DAS interrogators are equipped with high-performance hardware, such as GPUs, making computationally expensive operations feasible at the site. This will allow future integration of EEW algorithms directly into the interrogator hardware, further minimizing latencies.

In recent years, great effort has been made to develop the real-time methods needed to harness DAS for EEW, namely (1) detection, (2) location, (3) magnitude estimation, and (4) ground motion prediction. Most efforts have focused on developing the first three methods. The last one, ground motion prediction, only requires an earthquake location and magnitude33 as input for a Ground Motion Model (GMM), and can thus be used as in standard EEW with no DAS-specific adaptation.

In order to address real-time earthquake detection, PhaseNet-DAS34, a deep learning model designed to pick phase arrival times using DAS data, was developed. A recent study8 showed that this model requires retraining on local data to operate optimally in a new deployment or when the acquiring parameters are changed. Others10 used a simple short-term-average to long-term-average (STA/LTA) approach for pseudo real-time phase picking.

Real-time earthquake location methods generally differ in their reliance on phase picking and the use of a velocity model. Given sufficiently robust phase picking, several studies used phase arrival times at fiber channels for grid-search based earthquake location8,10,35, which depends on a velocity model. Others proposed location methods that do not rely on phase picking, such as back-migration21, or back-projection22, a detection and location approach that may be applicable in real-time. A beamforming-based detection-location algorithm applicable in real-time was suggested9 as a velocity model-free approach. A few studies8,9 also used the P- and S-phase arrival time difference to constrain the epicentral distance.

For real-time earthquake magnitude estimation and ground motion prediction, two primary strategies have been proposed using DAS data: physical7 and empirical36. The physics-based approach for real-time magnitude estimation and ground motion prediction7 relies on real-time conversion of strain-rate to ground motion, accounting for the different slowness of different seismic phases. The empirical approach36 provides real-time magnitude estimation directly from raw DAS strain-rate data. This method requires global and channel-specific calibration based on a local earthquake dataset8,36.

The combined EEW workflow of earthquake detection, location, and magnitude estimation has been demonstrated8,35 using deep-learning-based detection, travel-time-based location, and empirical magnitude estimation. The final step, ground motion prediction, was left to be part of the already operational ShakeAlert EEW system.

Although great efforts in recent years have brought us closer to operational DAS-based EEW systems, the reliance on empirical approaches for detection and magnitude estimation necessitates a prolonged calibration phase, including the acquisition of earthquake datasets, before a DAS-based EEW system can become operational. This problem is aggravated by the scarcity (or absence) of large hazardous earthquakes recorded by DAS. Furthermore, empirical magnitude estimation approaches are calibrated using available small-to-medium sized earthquakes, while meant to be used for large earthquakes. It remains unclear whether such relations reliably extrapolate to larger earthquakes.

Here we present a real-time feasible DAS-based EEW system that relies on physics-based approaches, is designed for fast deployment, and requires little calibration. This is a significant advantage given the short recording history of DAS, especially in areas of low seismicity rate (e.g., the East Levant region), where the seismic risk can be high but relevant data is scarce. The system is composed of three modules: (1) a detection-location module (modified from ref.9 for real-time operation), (2) a magnitude estimation module7, and (3) a ground motion prediction module7. This work further develops beamforming for real-time earthquake detection and location, including all data processing, phase association and runtime efficiency. We further combine the modified beamforming with previously developed magnitude estimation and ground motion prediction modules7 to form a complete DAS-based EEW system. Earthquake detection and location are achieved using a single module based on time-domain beamforming. We search for the arrival of coherent seismic phases, mark their arrival times, and identify the direction to their origin9. Magnitude estimation requires the conversion of strain rate to acceleration using the apparent slowness, followed by the calculation of acceleration root-mean-squares (rms)7. Magnitude is then estimated using a theoretical equation based on the omega squared model37 subject to high-frequency attenuation38. The magnitude and estimated location are then input to a GMM based on the same model assumptions used for magnitude estimation39,40. The ground motion model predicts peak ground motion values surrounding the epicenter. We demonstrate the system’s performance and real-time feasibility using local earthquakes with magnitudes in the range \(3.1 \le \textrm{M} \le 3.6\), and further discuss the advantages of the physics-based approach. The evaluation of the system’s performance over a long continuous data period is left for future work.

DAS EEW system

The real-time EEW system is composed of three modules (Fig. 1):

  1. Detection-location module

  2. Magnitude estimation module

  3. Ground motion prediction module

Fig. 1

Schematics of the DAS-EEW algorithm. This process occurs upon the arrival of each new data packet, and progresses based on the results of the current and past time steps.

In this subsection, we provide an overview of the algorithms, while details are provided in the following subsections.

The approaches we use utilize the spatial density of the measurements. The fiber is divided into two sets of sub-segments for subsequent array processing (Fig. 2b). Long overlapping segments with arbitrary geometry are used as sub-arrays in the detection-location module (color shades in Fig. 2b). Several short linear segments (black squares in Fig. 2b), sparsely distributed across the fiber, are used to convert strain-rate to ground acceleration for the magnitude estimation module.

Fig. 2

(a) Map of the study area. The fiber is depicted by a red line. Thin purple lines depict mapped active faults41. Green triangles depict the locations of the accelerometers used for the system validation. Red stars depict the catalog epicentral locations of the earthquakes \(\textrm{M}=3.3\) March 23 2024, \(\textrm{M}=3.4\) August 08 2024, \(\textrm{M}=3.1\) August 14 2024 and \(\textrm{M}=3.6\) March 13 2024. Inset - the study area is marked with a red rectangle. (b) Detailed fiber layout. The color shading marks the overlapping long fiber segments that are used for beamforming in the detection-location module. Black squares mark the location of the short fiber segments that are used for strain-rate to acceleration conversion in the magnitude estimation module.

Detection and location are inherently combined in our algorithm via the beamforming approach. Upon the arrival of a new data packet, we first search for slowness and backazimuth pairs that maximize the semblance (a coherency measure) of a finite time window of data9. Next, we pick the phases as the time with maximal coherency (semblance value). We associate the picks of each seismic phase across all available long segments to declare an event. Once an event is declared, we evaluate the epicenter location using an aggregation of back-azimuth beams and P-phase to S-phase arrival time difference from all available segments. The location and picks are updated continuously as new data packets arrive.

Magnitude estimation and ground motion prediction require a location estimate. Once an initial location estimate is available, we calculate the epicentral distance to the short segments, where the strain rate was converted to acceleration. Acceleration rms is continuously calculated starting at the P-wave arrival. The magnitude estimation module7 uses the maximum acceleration rms value and the epicentral distance, to evaluate the earthquake magnitude. Magnitude is estimated in each short segment for each new incoming data packet, and is then averaged between the short segments, yielding one global magnitude estimation.

Finally, the ground motion prediction module takes the most recent magnitude and epicenter location estimations as input, and predicts peak ground acceleration (PGA) and peak ground velocity (PGV) as a function of distance from the epicenter.

In the following, we detail the components of each module.

Beamforming detection-location

The beamforming42,43 location method is generally adapted from Ref.9 for real-time operation. The fiber is divided into individual segments of 501 channels (4.55 km), with an overlap of 250 channels (Fig. 2b), with 28 segments in total for the fiber used in this study. Each fiber segment is treated as a standalone sub-array18. Upon the arrival of a new data packet (1 second long, sampled at 100 Hz), we low-pass the last available 2.42 s of data with a time-domain moving average filter with a cutoff frequency of 5 Hz. The length of the examined time window is chosen such that it will fully capture a wavefront traveling at the maximum examined slowness (\(\mathrm {SLO_{max}} = 0.42 ~\mathrm {s~km^{-1}}\)) across the entire segment length (\(L = 4.55~\textrm{km}\)) with an extra 0.5 second as a buffer (\(4.55 \cdot 0.42 + 0.5 \approx 2.42 ~\textrm{sec}\)). We then examine the filtered data window for slowness (\(\textrm{SLO}\)) and back azimuth (\(\textrm{BAZ}\)) at each time sample, yielding a semblance44 time series per each BAZ-SLO combination:

$${\text{semb}}(t,{\text{BAZ}},{\text{SLO}}) = \frac{1}{N}\frac{{\left[ {\sum\nolimits_{{j = 0}}^{{j = N}} {\dot{\varepsilon }_{j} (t - \tau _{j} ({\text{BAZ}},{\text{SLO}}))} } \right]^{2} }}{{\sum\nolimits_{{j = 0}}^{{j = N}} {\dot{\varepsilon }_{j} (t - \tau _{j} ({\text{BAZ}},{\text{SLO}}))} ^{2} }},$$
(1)

where N is the number of channels in the segment, \(\dot{\varepsilon }_{j}(t)\) is the strain-rate trace of channel j, and \(\tau _j(\textrm{BAZ},\textrm{SLO})\) is the time shift that corresponds to a specific \(\textrm{BAZ}\)-\(\textrm{SLO}\) combination. The search space of \(\textrm{BAZ}\) spans 360 degrees with 2 degrees spacing, and that of \(\textrm{SLO}\) spans 0.1 to \(0.42 ~\mathrm {s/km}\) with \(0.02 ~\mathrm {s/km}\) spacing.
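As an illustration of Eq. (1), the following minimal sketch evaluates the semblance at one time sample for a single BAZ-SLO pair. It is not the implementation used in this work: the function names, the coordinate convention (x east, y north, BAZ clockwise from north), and the rounding of time shifts to whole samples are assumptions made for the example.

```python
import math

def plane_wave_tau(xy_km, baz_deg, slo_s_per_km, fs_hz=100.0):
    """Time shifts tau_j (in samples) for one BAZ-SLO pair.

    Assumed convention (not from the paper): x is east, y is north, and BAZ is
    measured clockwise from north, pointing from the array toward the source.
    """
    baz = math.radians(baz_deg)
    ux, uy = math.sin(baz), math.cos(baz)  # unit vector toward the source
    # A channel displaced toward the source is reached earlier, so its trace
    # is read at an earlier time: tau_j grows with the projection onto (ux, uy).
    return [round(slo_s_per_km * (x * ux + y * uy) * fs_hz) for x, y in xy_km]

def semblance(traces, tau, t):
    """Eq. (1) at sample index t; traces[j][i] is the strain rate of channel j."""
    n = len(traces)
    aligned = []
    for j in range(n):
        i = t - tau[j]
        if i < 0 or i >= len(traces[j]):
            return 0.0  # the shifted sample falls outside the data window
        aligned.append(traces[j][i])
    energy = sum(a * a for a in aligned)
    if energy == 0.0:
        return 0.0
    return sum(aligned) ** 2 / (n * energy)
```

In a grid search, `semblance` would be evaluated for every BAZ-SLO pair and every time sample of the window, and the maximum retained.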

We only consider the highest semblance value, and its associated slowness and back azimuth, for each analyzed 2.42 s interval. We define a possible phase pick at a time step if it meets two successive conditions. First, the possible pick yields a semblance value above a predefined threshold (\(\textrm{semb}_{\textrm{min}} = 0.15\)), and the associated semblance peak has a minimal width of 3 time samples (to avoid picking common-mode noise). We chose the semblance threshold value to be slightly above the average maximum semblance values of noise (see Supplementary Figs. S1-S2). Second, to verify that the phase pick is also associated with an amplitude increase, expected of an earthquake signal, we introduce an additional amplitude threshold condition. For the specific \(\textrm{SLO}\)-\(\textrm{BAZ}\) pair identified for that phase pick, we calculate the squared stacked amplitude across a segment (the numerator of Eq. 1 for specific \(\textrm{SLO}\)-\(\textrm{BAZ}\) pair), and define the pick’s amplitude as the average over a 0.2 s interval centered around the phase pick. The pre-pick amplitude is obtained as follows: for the nine 2.42 s time windows preceding the phase pick, we calculate the numerator of Eq. (1) for all \(\textrm{SLO}\)-\(\textrm{BAZ}\) pairs, average the results in time (per time window), take the maximum regardless of \(\textrm{SLO}\)-\(\textrm{BAZ}\) values, and average this maximum value for all nine preceding data intervals. Finally, we require that the signal’s average amplitude (around the phase pick) be at least 5 times larger than the average pre-pick amplitude (over nine data intervals). This amplitude thresholding can be seen as a variant of the regular \(\mathrm {STA/LTA}\) conditioning, tailored for delay-and-sum stacked traces. Depending on the segments’ geometry, there is ambiguity in SLO-BAZ space around the maximum semblance value. Hence we define the contour of 80% of the maximum semblance as equally plausible SLO-BAZ combinations.
For each possible pick, we save the timing (\(t_{\textrm{pick}}\)), maximum semblance value (\(\textrm{semb}(t_{\textrm{pick}})\)), mean slowness (\(\textrm{SLO} (t_{\textrm{pick}})\)) in the 80 % area and a backazimuths range (\(\mathrm {BAZ_1}(t_{\textrm{pick}})\) and \(\mathrm {BAZ_2}(t_{\textrm{pick}})\)) enclosing the 80 % area.
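The two pick conditions can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, and the peak width is assumed to mean the number of contiguous samples above the semblance threshold, which the text does not specify.

```python
def peak_width(semb_series, idx, threshold):
    """Number of contiguous samples around idx whose semblance exceeds threshold.

    Assumption: this is how the 'minimal width of 3 time samples' is measured.
    """
    if semb_series[idx] <= threshold:
        return 0
    lo = idx
    while lo > 0 and semb_series[lo - 1] > threshold:
        lo -= 1
    hi = idx
    while hi < len(semb_series) - 1 and semb_series[hi + 1] > threshold:
        hi += 1
    return hi - lo + 1

def is_candidate_pick(semb_series, idx, pick_amp, prepick_amps,
                      semb_min=0.15, min_width=3, amp_ratio=5.0):
    """Apply the semblance condition first, then the amplitude condition.

    pick_amp: stacked squared amplitude averaged around the pick.
    prepick_amps: one maximum stacked amplitude per preceding window (nine in the text).
    """
    if peak_width(semb_series, idx, semb_min) < min_width:
        return False
    baseline = sum(prepick_amps) / len(prepick_amps)
    return pick_amp >= amp_ratio * baseline
```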

To filter out sporadic picks that are likely to be erroneous (see Supplementary Figs. S1-S2), we use a simple spatio-temporal association rule45 across the entire fiber. We demand that the time difference between the picks of every two fiber segments (represented by the location of their middle channel) is no longer than the direct P-wave travel time between the two (\(\Delta t_P \le r/V_P\); \(V_P = 5~\mathrm {km/s}\), and r is the distance between the segments). For each segment, we count the number of segments it is associated with, excluding segment couples that are distanced by more than 25 km (\(\Delta t_P=5\) seconds). If the number of associated segments is greater than a predefined threshold (7 segments), we declare a P-phase arrival and that \(t_{\textrm{phase}}^{P} = t_{\textrm{pick}}\) at that segment. Since picks are added continuously, we allow the update of already associated segments if a new pick has a higher semblance value and it is within less than 2.42 seconds (the examined time window) from the original associated pick. The latter condition aims to keep the updated pick associated. If a segment is not associated yet, we allow an update if a new pick has a higher semblance value, without restriction on the time difference.
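The association rule for P picks can be sketched as below. The function names and the use of planar segment-center coordinates in km are assumptions for illustration; the thresholds are those quoted in the text.

```python
import math

def association_counts(centers_km, pick_times_s, vp_km_s=5.0, max_dist_km=25.0):
    """For each segment, count the other segments whose pick is consistent with it.

    Two picks are consistent if their time difference does not exceed the direct
    P-wave travel time between the segment centers (dt <= r / V_P). Segment pairs
    farther apart than max_dist_km are not counted. A pick time of None means
    that the segment has no pick yet.
    """
    counts = []
    for i, (ci, ti) in enumerate(zip(centers_km, pick_times_s)):
        n = 0
        for j, (cj, tj) in enumerate(zip(centers_km, pick_times_s)):
            if i == j or ti is None or tj is None:
                continue
            r = math.hypot(ci[0] - cj[0], ci[1] - cj[1])
            if r <= max_dist_km and abs(ti - tj) <= r / vp_km_s:
                n += 1
        counts.append(n)
    return counts

def declared_p(counts, min_associated=7):
    """A P-phase arrival is declared where the count is greater than the threshold."""
    return [c > min_associated for c in counts]
```

With ten segments spaced 2.275 km apart and picks that follow a coherent wavefront, every segment is associated with the nine others; a single pick delayed by 10 s loses all of its associations.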

After the P-phase arrival is declared in a segment, we search for a second pick that has a higher mean slowness (\(\textrm{SLO}(t_{\textrm{pick}})\)) than that of the \(t_{\textrm{phase}}^{P}\) pick and declare it as the S-phase arrival (\(t_{\textrm{phase}}^{S}\)). Again, we filter out erroneous S-phase picks via a spatio-temporal association rule. For S-phase association, we approximate a maximal time difference between segment couples by multiplying the already measured cross-segment P-phase time differences by a \(\Delta t_S/\Delta t_P = 2\) factor, representing a conservative slowness ratio between the phases.

The process described above is performed continuously, and every time a new segment is declared associated (either for P- or S-phase), we use the corresponding backazimuths range (\(\mathrm {BAZ_1}(t_{\textrm{pick}})\) and \(\mathrm {BAZ_2}(t_{\textrm{pick}})\)) and add a corresponding beam to an epicenter location score map. Every beam is assigned a weight (w) that depends on the corresponding square of the semblance value (\(w = (1+\textrm{semb}^2)\)). On the location score map, every beam is masked at a radius of \(r_\textrm{mask} = 15\) km around the segment’s center to avoid erroneous intersections of beams of neighboring segments. The minimal epicentral distance from a segment is consistent with the plane wave assumption of our beamforming approach. It is also consistent with the vanishing beamforming sensitivity of a horizontal array (fiber) to sources located directly beneath it42, as the absence of elevation difference between the sensors hinders the ability to resolve a strictly upward moving wavefront. This rule does not reduce our ability to locate earthquakes that are very close to the fiber, since farther segments can still point to the correct location, even if it lies within a short distance from other fiber segments.

If both P- and S-phase associated picks are available for a segment, the S-P time difference can be represented as a ring surrounding the fiber segment, with width corresponding to distance uncertainty. Previous studies 46,47 used an S-P time difference to epicentral distance ratio of 1 s per 8 km. We use the same ratio and define the inner and outer radius of the ring (representing distance uncertainty) as 1 s per 7 and 9 km, respectively. The ring is added to the location score map and weighted by the mean of the corresponding P and S weights (\(w_{SP} = (w_P+w_S)/2\)). This constraint on the epicentral distance is crucial for locating off-network events, which are defined by a narrow azimuthal range between the fiber and the epicenter9. There are other methods to locate earthquakes using sparse seismic observations48,49,50, yet their use for one-component real-time DAS data needs to be established.
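The beam and ring contributions to the location score map can be sketched as follows. The helper names, the planar km coordinates, and the assumption that the back-azimuth range runs clockwise from BAZ_1 to BAZ_2 are illustrative choices, not taken from the paper.

```python
import math

def beam_weight(semb):
    """Beam weight w = 1 + semb^2."""
    return 1.0 + semb ** 2

def ring_weight(w_p, w_s):
    """Ring weight: mean of the P and S beam weights."""
    return 0.5 * (w_p + w_s)

def sp_ring_km(dt_sp_s, inner_km_per_s=7.0, outer_km_per_s=9.0):
    """Inner and outer ring radii (km) for an S-P time difference in seconds."""
    return inner_km_per_s * dt_sp_s, outer_km_per_s * dt_sp_s

def in_beam(cell_km, center_km, baz1_deg, baz2_deg, r_mask_km=15.0):
    """True if a map cell lies inside the beam and outside the masked radius.

    Assumptions: x is east, y is north, bearings are clockwise from north,
    and the beam spans clockwise from baz1_deg to baz2_deg.
    """
    dx, dy = cell_km[0] - center_km[0], cell_km[1] - center_km[1]
    if math.hypot(dx, dy) < r_mask_km:
        return False  # too close to the segment: masked
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    span = (baz2_deg - baz1_deg) % 360.0
    return (bearing - baz1_deg) % 360.0 <= span
```

For example, an S-P time of 3 s maps to a ring between 21 km and 27 km from the segment center.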

Magnitude estimation

We apply a physics-based magnitude estimation approach7, based on the Brune Omega-squared model37 subject to high frequency attenuation38. The input for this model is the peak RMS of ground acceleration within the available data interval, starting from the P-phase arrival. Since DAS measures strain rate (\(\dot{\varepsilon } (t)\)), rather than ground acceleration, a real-time conversion procedure is first applied, as detailed below. This magnitude estimation approach was validated in a former study7 for events with larger magnitudes than presented in this study.

Strain rate to ground acceleration conversion

We choose five short linear segments (380 m long) along the fiber. We use linear segments to obtain well-defined apparent phase slowness, and we use short segments in order to capture the short coherency lengths of scattered waves. The short segments are chosen such that they are as equally spaced as possible along the fiber, do not show any continuous high noise characteristics, and are not muted during a well-recorded seismic event (the \(\textrm{M}=3.6\) event). We note that we chose only five short segments for computational efficiency, while additional segments are expected to yield more stable and robust results. See further details in the discussion.

Using slant-stack7 per short fiber segment, we continuously calculate the apparent slowness (\(S_{\textrm{app}}\)) that maximizes the semblance per time sample. This procedure produces an apparent slowness time series, which is then smoothed using an exponential moving average51 with a lowpass cutoff frequency of \(f_c \approx 0.2\) Hz (implemented using smoothing factor \(\alpha = 0.0125\)). Then, the strain-rate in the middle channel of the short segment is converted to acceleration as7,52:

$$\begin{aligned} A(t) = \frac{1}{|S_{app}(t)|} \dot{\varepsilon } (t) , \end{aligned}$$
(2)

where \(|S_{app}(t)|\) is the absolute value of the smoothed time-dependent apparent slowness.
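A minimal sketch of the smoothing and of the conversion in Eq. (2) is given below. The function names, the initialization of the moving average with the first sample, and the SI units (slowness in s/m, strain rate in 1/s) are assumptions for the example. The helper `ema_cutoff_hz` uses the common small-\(\alpha\) approximation \(f_c \approx \alpha f_s / 2\pi\), which reproduces the quoted cutoff of about 0.2 Hz at a 100 Hz sampling rate.

```python
import math

def ema(series, alpha=0.0125):
    """Exponential moving average; the first output equals the first input (assumed)."""
    out, state = [], None
    for x in series:
        state = x if state is None else alpha * x + (1.0 - alpha) * state
        out.append(state)
    return out

def ema_cutoff_hz(alpha, fs_hz):
    """Approximate low-pass cutoff of the EMA, valid for small alpha."""
    return alpha * fs_hz / (2.0 * math.pi)

def strain_rate_to_acceleration(strain_rate, s_app_smoothed):
    """Eq. (2): A(t) = strain_rate(t) / |S_app(t)|, sample by sample.

    With strain rate in 1/s and slowness in s/m, the result is in m/s^2.
    The caller must ensure that the smoothed slowness is non-zero.
    """
    return [e / abs(s) for e, s in zip(strain_rate, s_app_smoothed)]
```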

Magnitude estimation

The seismic moment (\({M}_{0}\)) and the magnitude (\({M}_{W}\)) can be analytically estimated from band-limited ground motion accelerations as7:

$$\begin{aligned} & {M}_{0}=\frac{1}{27{a}_{1}}{\left( \frac{{a}_{4}}{{2}^\frac{1}{3}}+\frac{{2}^\frac{1}{3}{a}_{2}^{2}}{{a}_{4}}+{a}_{2}\right) }^{3}, \end{aligned}$$
(3)
$$\begin{aligned} & {M}_{W}=2log_{10}\left( \frac{{a}_{4}}{{2}^\frac{1}{3}}+\frac{{2}^\frac{1}{3}{a}_{2}^{2}}{{a}_{4}}+{a}_{2}\right) -\frac{2}{3}log_{10}\left( {a}_{1}\right) -7.05, \end{aligned}$$
(4)

where the coefficients are:

$$\begin{aligned}&{a}_{1}=113014\left[ \frac{{k}^{2}{U}_{\varphi \theta }}{{C}^{3}}\right] {\Delta \tau }^\frac{2}{3}\frac{1}{R\sqrt{T}} \nonumber \\&{a}_{2}={A}_{rms} \nonumber \\&{a}_{3}=1828968\left[ {k}^{2}\right] \Delta {\tau }^\frac{2}{3}{A}_{rms} \nonumber \\&{a}_{4}={\left( 3\sqrt{3\left( 27{a}_{1}^{4}{a}_{3}^{2}+4{a}_{1}^{2}{a}_{2}^{3}{a}_{3}\right) }+27{a}_{1}^{2}{a}_{3}+2{a}_{2}^{3}\right) }^\frac{1}{3}. \end{aligned}$$
(5)

The predetermined parameters in Eq. 5 (Table 1) are a constant (k), the average radiation pattern (\(U_{\phi \theta }\)), the wave velocity at the source (C), and the assumed stress drop (\(\Delta \tau\)). The parameter groups enclosed in brackets in Eq. 5 are phase-specific (P or S). The real-time evaluated variables are the hypocentral distance (R, assuming a fixed hypocentral depth of 10 km), the length of the time window starting at the P arrival (\(T = t-t^{phase}_P\)), and the maximum RMS of the ground acceleration signal in the time window T (\(A_{rms}\)).
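Equations (3)-(5) can be evaluated directly once \(a_1\), \(a_2\) and \(a_3\) are known. The sketch below takes these three coefficients as inputs, since the Table 1 parameters needed to build them are not reproduced here; the function name is hypothetical and any numbers passed to it in an example are placeholders. Expanding Eq. (5) shows that the bracketed term X shared by Eqs. (3) and (4) satisfies \(X^{3} - 3a_{2}X^{2} = 27a_{1}^{2}a_{3}\), which gives a convenient numerical check.

```python
import math

def moment_and_magnitude(a1, a2, a3):
    """Evaluate Eqs. (3)-(5) for positive coefficients a1, a2, a3.

    Returns (M0, Mw, X), where X is the bracketed term shared by Eqs. (3) and (4).
    """
    # Eq. (5): a4 from a1, a2, a3.
    a4 = (3.0 * math.sqrt(3.0 * (27.0 * a1**4 * a3**2 + 4.0 * a1**2 * a2**3 * a3))
          + 27.0 * a1**2 * a3 + 2.0 * a2**3) ** (1.0 / 3.0)
    c = 2.0 ** (1.0 / 3.0)
    x = a4 / c + c * a2**2 / a4 + a2
    m0 = x**3 / (27.0 * a1)                                          # Eq. (3)
    mw = 2.0 * math.log10(x) - (2.0 / 3.0) * math.log10(a1) - 7.05   # Eq. (4)
    return m0, mw, x
```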

Table 1 Ground motion model parameters.

We monitor the evolution of the RMS of ground acceleration at each middle channel of the designated short segments (upon which the conversion to ground motion was calculated), starting from \(t^{phase}_P\), which is interpolated between the long segments where the actual picking took place. For each data packet (of 1 second duration) that follows \(t^{phase}_P\), we evaluate the maximal RMS value \(\max (A_{rms}(t))\), such that even if the RMS is decreasing for a certain time (e.g., in the P coda), the magnitude evaluation is not affected (if the epicenter location is fixed, the magnitude estimation never decreases). Once the S phase pick is available \(t^{phase}_S\), we incorporate the difference in arrival times (\(\Delta _{SP} = t^{phase}_S-t^{phase}_P\)) into the magnitude estimation scheme, weighting the phase specific parameters in Eq. 5 by the fraction of the P or S phase interval relative to the total duration of the signal.
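The packet-by-packet tracking of the maximum RMS can be sketched as follows. This is an illustration under stated assumptions: the RMS is taken over the growing window that starts at the P arrival, the function name is hypothetical, and the S-P weighting of the phase-specific parameters is not included.

```python
import math

def running_max_rms(accel, p_index, fs_hz=100, packet_s=1.0):
    """Track max(A_rms) after the P arrival, one data packet at a time.

    Returns a list of (T, max_rms) tuples, where T is the elapsed time in
    seconds since the P arrival at the end of each packet.
    """
    packet = int(round(fs_hz * packet_s))
    sum_sq, n, best = 0.0, 0, 0.0
    history = []
    for start in range(p_index, len(accel) - packet + 1, packet):
        chunk = accel[start:start + packet]
        sum_sq += sum(a * a for a in chunk)
        n += len(chunk)
        rms = math.sqrt(sum_sq / n)
        best = max(best, rms)  # the retained value never decreases
        history.append((n / fs_hz, best))
    return history
```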

The omega-squared model describes the full ground motion field (i.e., 3 components), while DAS only records one component (in our case, horizontal). To compensate for the weak DAS sensitivity to a broadside incidence angle, we apply a correction factor of 2 to the measured \(A_{rms}(t)\) during P (assuming an average incidence angle of 45 degrees relative to the fiber axis27,53). To compensate for the partitioning of S-waves onto two horizontal components54,55, we apply a correction factor of \(\sqrt{2}\) to the measured \(A_{rms}(t)\) during S.

Ground motion prediction

Using the same source model and the same parameters (Table 1) used for the magnitude estimation, PGA and PGV are predicted using a physics-based GMM7:

$$\begin{aligned} & PGA=3.3{M}_{0}^{1/3}{\Delta \tau }^{2/3}\frac{{\beta }_{A}}{R\sqrt{{\kappa }_{0}\left[ \frac{1}{k{C}_{S}}{\left( \frac{7}{16}\frac{{M}_{0}}{\Delta \tau }\right) }^\frac{1}{3}+\frac{R}{{C}_{S}}\right] }{\left[ 1+{1.5}^{-\frac{1}{4}}\pi {\kappa }_{0}k{C}_{S}{\left( \frac{16}{7}\frac{\Delta \tau }{{M}_{0}}\right) }^\frac{1}{3}\right] }^{2}}, \end{aligned}$$
(6)
$$\begin{aligned} & PGV=2.9\sqrt{{M}_{0}\Delta \tau }\frac{{\beta }_{V}}{R\sqrt{\frac{1}{k{C}_{S}}{\left( \frac{7}{16}\frac{{M}_{0}}{\Delta \tau }\right) }^{1/3}+R/{C}_{S}}{\left[ 1+{\pi }^\frac{4}{3}{\kappa }_{0}k{C}_{S}{\left( \frac{16}{7}\frac{\Delta \tau }{{M}_{0}}\right) }^\frac{1}{3}\right] }^\frac{3}{2}}, \end{aligned}$$
(7)

where \({\beta }_{V}=\frac{2\pi {U}_{\phi \theta }Fs\sqrt{\frac{16}{7}} {\left( k{C}_{S}\right) }^\frac{3}{2}}{(\sqrt{2\pi }4\rho {C}_{S}^{3})}\) and \({\beta }_{A}=\frac{4\pi {U}_{\phi \theta }Fs{\left( \frac{16}{7}\right) }^{2/3}{\left( k{C}_{S}\right) }^{2}}{\left( \sqrt{\pi }4\rho {C}_{S}^{3}\right) }\).
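Equations (6) and (7) translate into code as sketched below. The function and argument names are hypothetical, SI units are assumed throughout, and `fs_factor` stands for the factor written Fs in the text. Parameter values must come from Table 1; none are assumed here.

```python
import math

def beta_a(u, fs_factor, k, cs, rho):
    """beta_A as defined after Eq. (7)."""
    return (4.0 * math.pi * u * fs_factor * (16.0 / 7.0) ** (2.0 / 3.0) * (k * cs) ** 2
            / (math.sqrt(math.pi) * 4.0 * rho * cs**3))

def beta_v(u, fs_factor, k, cs, rho):
    """beta_V as defined after Eq. (7)."""
    return (2.0 * math.pi * u * fs_factor * math.sqrt(16.0 / 7.0) * (k * cs) ** 1.5
            / (math.sqrt(2.0 * math.pi) * 4.0 * rho * cs**3))

def _duration_term(m0, r, dtau, k, cs):
    # Bracketed term shared by both equations: a source term plus R / C_S.
    return (1.0 / (k * cs)) * ((7.0 / 16.0) * m0 / dtau) ** (1.0 / 3.0) + r / cs

def pga(m0, r, dtau, kappa0, k, cs, b_a):
    """Eq. (6): peak ground acceleration at hypocentral distance r."""
    atten = 1.0 + 1.5 ** (-0.25) * math.pi * kappa0 * k * cs * ((16.0 / 7.0) * dtau / m0) ** (1.0 / 3.0)
    denom = r * math.sqrt(kappa0 * _duration_term(m0, r, dtau, k, cs)) * atten**2
    return 3.3 * m0 ** (1.0 / 3.0) * dtau ** (2.0 / 3.0) * b_a / denom

def pgv(m0, r, dtau, kappa0, k, cs, b_v):
    """Eq. (7): peak ground velocity at hypocentral distance r."""
    atten = 1.0 + math.pi ** (4.0 / 3.0) * kappa0 * k * cs * ((16.0 / 7.0) * dtau / m0) ** (1.0 / 3.0)
    denom = r * math.sqrt(_duration_term(m0, r, dtau, k, cs)) * atten**1.5
    return 2.9 * math.sqrt(m0 * dtau) * b_v / denom
```

Both functions decay with distance faster than 1/R because of the additional R-dependent term under the square root.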

A previous study2 showed that predictions from this GMM are similar to predictions from empirical laws for that region. For EEW purposes, the location and magnitude estimations are merely an instrument to facilitate ground motion prediction at sites of interest. Hence, we define ground motions as the target function of the entire algorithm and focus on the comparison between predicted ground motion and those observed at conventional seismic stations.

Data

The DAS recordings used to test and demonstrate the EEW algorithms were acquired using a 66 km-long optical fiber trending north-south (Fig. 2a). The fiber was interrogated using a Prisma Photonics unit. Data were sampled at 1500 Hz and later down-sampled to 100 Hz using an anti-alias filter. The gauge length is set to 18.2 m, and the spatial sampling is set to 9.1 m. The fiber is buried in a shallow duct along a busy motorway. This fiber was deployed neither for seismological data acquisition nor as an EEW deployment, yet we repurpose the acquired data for that goal. In order to simulate real-time operations, we stream the pre-recorded DAS data as one-second-long data packets to the EEW algorithm described above.

We show three earthquakes56 in the main text (Fig. 2a & Table 2). Each earthquake has unique characteristics:

  1. An \(\textrm{M} = 3.3\) earthquake originated relatively close, east of the fiber (landside), with a very wide BAZ range relative to the fiber (“in-network”9). The event most likely originated on the known Carmel-Tirza fault system57.

  2. An \(\textrm{M} = 3.6\) earthquake originated at a large epicentral distance and has narrow BAZ range from fiber to epicenter (“off-network”9). The event occurred near the Sea of Galilee, on a branch of the Dead Sea fault system.

  3. An \(\textrm{M} = 3.4\) shallow earthquake originated relatively close, west of the fiber (seaside), with a very wide BAZ range relative to the fiber (“in-network”). The tectonic and lithologic province of such shallow earthquakes in the Israeli continental shelf is uncertain and still under active research58.

Table 2 Earthquakes.

A fourth earthquake (\(\textrm{M} = 3.1\)) is shown in the Supplementary Information (Supplementary Figs. S3-S5) as it is similar to the \(\textrm{M} = 3.3\) event, albeit with a narrower BAZ range and larger epicentral distance.

Catalog locations and magnitudes are taken from the Geological Survey of Israel56. PGA and PGV values were calculated as the geometric mean of the two horizontal traces per station, for all available Geological Survey of Israel accelerometers located up to 100 km from the epicenter. To this end, accelerograms were integrated to velocity seismograms followed by a 1 Hz high-pass filter.
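The observed peak values and their comparison with predictions can be sketched as follows. The function names are hypothetical, the geometric mean is assumed to be taken over the peak absolute values of the two horizontal traces, and the population standard deviation is an assumption, since the text does not state which estimator is used.

```python
import math
import statistics

def geometric_mean_peak(h1, h2):
    """Geometric mean of the peak absolute values of two horizontal traces."""
    return math.sqrt(max(abs(v) for v in h1) * max(abs(v) for v in h2))

def log_residuals(predicted, observed):
    """Mean and standard deviation of log10(predicted) - log10(observed) over stations."""
    res = [math.log10(p) - math.log10(o) for p, o in zip(predicted, observed)]
    return statistics.mean(res), statistics.pstdev(res)
```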

Results

We show three case studies of earthquakes: (1) an \(\textrm{M} = 3.3\) event close to the fiber from the east (Figs. 3 & 4), (2) an \(\textrm{M} = 3.6\) event farther from the fiber (Figs. 5 & 6), and (3) an \(\textrm{M} = 3.4\) event close to the fiber from the west (Figs. 7 & 8). A fourth \(\textrm{M} = 3.1\) event is shown in Supplementary Figs. S3-S5. A movie depicting the real-time process is available in the Supplementary Information for all four earthquakes.

Fig. 3

Complete EEW workflow for a close earthquake from the landside (\(\textrm{M} = 3.3\); 2024-03-23 18:35:56). (a) Strain rate data. Blue and yellow markers depict fiber segments where P or S was detected, respectively, while red markers denote unassigned and potentially erroneous picks. (b) Location score maps at different time steps. Dark colors represent highly scored areas. The green cross depicts the real-time updated location, and the star shows the catalog location. Blue shade and yellow markers depict fiber segments where P or S was detected, respectively. (c) Real-time estimated magnitude, averaged over all short segments. The shaded background depicts the standard deviation. (d) Difference between predicted and observed (accelerometers) \(\text {log}_{10}\)PGA averaged for all epicentral distances. The shaded background depicts the standard deviation.

Fig. 4

Epicentral distance and \(A_{\textrm{rms}}\) influence on magnitude estimation and ground motion prediction for a close earthquake from the landside (\(\textrm{M} = 3.3\); 2024-03-23 18:35:56). (a) The difference between real-time epicentral distance and catalog distance, for all short segments versus time. (b) Maximum acceleration RMS measured from P pick, for all short segments along the fiber. (c) Real-time estimated magnitude for all short segments. The green curve depicts the average, which is used for ground motion prediction. (d) Difference between predicted and observed \(\text {log}_{10}\)PGA averaged for all epicentral distances. The shaded background depicts the standard deviation (replica of Fig. 3d).

Fig. 5

Complete EEW workflow for a far off-network earthquake (\(\textrm{M} = 3.6\); 2024-03-13 12:50:30). Panels same as Fig. 3.

Fig. 6

Epicentral distance and \(A_{\textrm{rms}}\) influence on magnitude estimation and ground motion prediction for a far off-network earthquake (\(\textrm{M} = 3.6\); 2024-03-13 12:50:30). Panels same as Fig. 4.

Fig. 7

Complete EEW workflow for a close earthquake from the seaside (\(\textrm{M} = 3.4\); 2024-08-06 17:27:53). Panels same as Fig. 3.

Fig. 8

Epicentral distance and \(A_{\textrm{rms}}\) influence on magnitude estimation and ground motion prediction for a close earthquake from the seaside (\(\textrm{M} = 3.4\); 2024-08-06 17:27:53). Panels same as Fig. 4.

Figures 3, 5, and 7 provide an overview of the complete real-time EEW procedure for each of the earthquakes. Panels (a) show the DAS strain-rate data as a function of distance along the fiber and time, including the real-time beamforming-based P (blue markers) and S (yellow markers) picks. Panels (b) are snapshots of epicenter location score maps (range 0–1), where dark colors represent high scores, the green cross is the mean location considering location scores above 0.95, and the yellow star depicts the catalog epicentral location. Panels (c) plot the real-time magnitude estimation with the catalog magnitude for reference, and panels (d) show the residuals between predicted (via the GMM) and observed (on accelerometers) PGA. The standard deviation of PGA residuals, indicated by the shaded region, is calculated using all epicentral distances.
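The real-time location marked by the green cross can be illustrated with a short Python sketch that averages the coordinates of all grid nodes whose score exceeds the 0.95 threshold. The function name `mean_location`, the regular grid layout, and the use of an unweighted mean are assumptions made for illustration; the text states only that the mean considers location scores above 0.95.

```python
import numpy as np


def mean_location(x_km, y_km, score, threshold=0.95):
    """Mean (x, y) of grid nodes whose location score exceeds `threshold`.

    x_km, y_km : 1-D grid axes
    score      : 2-D score map of shape (len(y_km), len(x_km)), values in 0-1
    Returns None when no node passes the threshold.
    """
    score = np.asarray(score, dtype=float)
    mask = score > threshold
    if not mask.any():
        return None
    xx, yy = np.meshgrid(x_km, y_km)  # both have shape (ny, nx)
    # Unweighted mean over the selected nodes (an assumption of this sketch).
    return float(xx[mask].mean()), float(yy[mask].mean())
```

When the high-score region is elongated, as in the early time steps of a poorly constrained event, the mean falls near the center of that region, so the spread of the selected nodes is a natural companion measure of location uncertainty.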

Figures 4, 6, and 8 show more details of the same earthquakes, focusing on the short segments that are used in the magnitude estimation module. Panels (a) depict the epicentral distance discrepancy between real-time estimated and catalog locations as a function of time, for each short segment. Panels (b) show the maximum ground acceleration RMS as a function of time for each short segment (the vertical axis corresponds to distance along the fiber), starting shortly after a P pick is available in a segment. The horizontal dashed lines are a reference zero RMS for each short segment. For each segment, the epicentral distance (panels a) and the maximum acceleration RMS (panels b) are used for magnitude estimation, per segment, shown in panels (c). The green curve in panels (c) is the average magnitude, which is used along with the epicenter location estimate for ground motion prediction, shown in panels (d) (identical to panels d in Figs. 3, 5, and 7).

The \(\textrm{M}=3.3\) event that is close to the fiber from the east (Fig. 3) demonstrates a preferable fiber-earthquake configuration for EEW (Fig. 3b). The epicenter is relatively close to the fiber’s northern end. The azimuthal coverage of the segments near the apex allows for a very fast constraint on the location, only 0.35 seconds after the first P was detected (Figs. 3a,b & 4a). Furthermore, the real-time beamforming method per segment is most effective in this case, since near the apex the apparent slowness and the required moveout times are minimal, allowing for constructive delay-and-sum beamforming without waiting for extra data packets. Since the location is constrained very early, the small time-dependent variation in the magnitude estimation (Fig. 3c) is solely the result of acceleration RMS variation (Fig. 4b), which occurs as more seismic energy is recorded over time and additional short segments are added (black squares in Fig. 3a). Since magnitude is initially underestimated (Figs. 3c & 4c), PGA prediction is also, on average, initially underestimated (Fig. 3d). Even so, PGA underestimation in the first few seconds is rather small, less than half an order of magnitude on average. For more detailed ground motion prediction (PGV and PGA) see Supplementary Figure S6.
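For readers unfamiliar with delay-and-sum beamforming, the following Python sketch shows the generic plane-wave version of the technique on a small sub-array. It is not the real-time implementation used in this study: the function name `delay_and_sum`, the integer-sample shifts, the circular shifting via `np.roll`, and the mean-square beam power are assumptions chosen for brevity. The slowness vector is taken to point in the direction of propagation.

```python
import numpy as np


def delay_and_sum(data, xy_km, fs, slowness_grid):
    """Beam power for each trial horizontal slowness vector.

    data          : (n_channels, n_samples) array of traces
    xy_km         : (n_channels, 2) channel coordinates in km
    fs            : sampling rate in Hz
    slowness_grid : (n_trials, 2) trial slowness vectors in s/km
    """
    data = np.asarray(data, dtype=float)
    xy_km = np.asarray(xy_km, dtype=float)
    n_ch, n_samp = data.shape
    power = np.empty(len(slowness_grid))
    for i, s in enumerate(np.asarray(slowness_grid, dtype=float)):
        # Plane-wave moveout of each channel relative to the origin.
        shifts = np.rint((xy_km @ s) * fs).astype(int)
        beam = np.zeros(n_samp)
        for trace, k in zip(data, shifts):
            beam += np.roll(trace, -k)  # undo the moveout, then stack
        power[i] = np.mean((beam / n_ch) ** 2)
    return power
```

The stack is constructive only when the trial slowness matches the true moveout. When the apparent slowness is small, as near the apex, the required shifts span few samples, which is why little extra data is needed before a coherent beam forms.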

The \(\textrm{M}=3.6\) event is farther away from the fiber and with narrow azimuthal coverage (Fig. 5). This is the least preferable fiber-earthquake configuration for EEW. The initial location is unstable and is not constrained in terms of distance (Figs. 5a,b & 6a). The location error becomes small and stable only after a few S-P rings are added (Fig. 5b). The good initial magnitude estimation (Fig. 5c) should be considered unreliable until the location uncertainty is reduced and stable. After that, the deviation of predicted PGA from observed PGA follows the magnitude estimation trend, i.e., initial underestimation, which improves as more information flows in (Fig. 5d). For more detailed ground motion prediction (PGV and PGA) see Supplementary Figure S7.
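The distance constraint carried by each S-P ring follows from the standard relation between S-minus-P time and source distance in a homogeneous medium. The Python sketch below is a minimal illustration; the function name `sp_distance_km` and the default velocities (6.0 and 3.5 km/s) are hypothetical values chosen for the example, not parameters of this study.

```python
def sp_distance_km(sp_time_s, vp_km_s=6.0, vs_km_s=3.5):
    """Source distance implied by an S-minus-P time, homogeneous medium.

    t_S - t_P = d / v_S - d / v_P  =>  d = (t_S - t_P) * v_P * v_S / (v_P - v_S)
    The default velocities are illustrative assumptions.
    """
    if vp_km_s <= vs_km_s:
        raise ValueError("P velocity must exceed S velocity")
    return sp_time_s * vp_km_s * vs_km_s / (vp_km_s - vs_km_s)
```

Each segment with both a P and an S pick therefore contributes a ring of possible source positions around it, and intersecting a few such rings with the beam directions fixes the distance that narrow azimuthal coverage alone cannot resolve.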

The epicenter of the \(\textrm{M}=3.4\) event is close to the northern end of the fiber. P-phase strain-rate was registered weakly and incoherently on the fiber (Fig. 7a, blue markers). While P-phase picks are somewhat inaccurate, they generally draw a reasonable moveout across the fiber. However, most of the resultant beam stack (Fig. 7b at \(t=3.74\) s) points north instead of north-west (true BAZ to epicenter), as if the event is out-of-network, putting the real-time location outside the extent of the location score map and with large uncertainty (black contour in Fig. 7b). S-phase picking is also incomplete and inaccurate (Fig. 7a, yellow markers), yet, in combination with the few highly weighted beams, the S-P rings add enough information to constrain the location at \(t=4.74\) seconds (Figs. 7b & 8a). This major update in location, where the real-time epicenter is much closer to the fiber, causes the magnitude to decrease (Figs. 7c & 8c). We note that due to the sparse picking, only three out of five short segments (Fig. 8b) contribute to the magnitude estimation (Fig. 8c). Furthermore, the magnitude estimation is delayed compared to the first P picks, because it takes more time to associate the sparse P picks. Despite reasonably good location and magnitude estimates, the predicted PGA is overestimated (Fig. 7d). For more detailed ground motion prediction (PGV and PGA) see Supplementary Figure S8.

Each of the pseudo real-time playbacks shown above was performed under realistic computation time for real-time operations. Namely, each 1 second long data packet was processed through all the modules in less than 1 s. The detection-location module is the most computationally demanding one. After accelerating this computation with a GPU (NVIDIA A100, using PyTorch and CuPy Python packages), it takes about 80 % (0.8 s) of a data packet’s computation time. These 0.8 s are the baseline average computation time when an earthquake is not detected. When a detection occurs, additional time, always less than the remaining 0.2 s, is required for the other modules to run, depending on the number of picked segments.

Discussion

Our study demonstrates the viability and performance of a physics-based DAS earthquake early warning system using several DAS earthquake records. The system performed excellently for events recorded under favorable conditions (\(\textrm{M} = 3.3\), Figs. 3 & 4, and \(\textrm{M} = 3.1\), Supplementary Figures S3–S5), i.e., coherent high SNR for both P- and S-phases, and good azimuthal coverage of the fiber with respect to the epicenter. These conditions are expected for large earthquakes, which are the events relevant for EEW. The system rapidly outputs ground motion predictions that only slightly deviate (underestimation) from those recorded by accelerometers. These predictions converge to a negligible deviation in a predictable manner, primarily as additional waveforms are recorded at each time step. We also showed that the system performed reasonably well for an event with unfavorable conditions (\(\textrm{M} = 3.4\), Figs. 7 & 8). Despite the low SNR, low P-phase coherency, and uncharted tectonic regime of the very shallow source, the system was able to locate the event and estimate its magnitude. The ground motion prediction was reasonably good as well, albeit slightly overestimated, possibly due to a lower stress drop for this event compared to that of other sources in the region, an issue that requires further investigation.

The evaluation of the system’s performance over a long continuous data period is left for future work. Nevertheless, we provide minimal examples of continuous data in Supplementary Figures S1 (3 h prior to the \(\textrm{M} = 3.6\) event) and S2 (with no known event) where no false detections occurred. These examples demonstrate the robustness of the detection-location module. The robustness stems from several successive conditions that need to be met before an event is declared, both at the single sub-array level (one fiber segment) and at the network level (the entire fiber). Owing to the dense spatial measurements, the sub-arrays used are more densely spaced than traditional point sensors. As a result, the restrictive conditions at the network level (e.g., waiting for at least 7 associated segments instead of 4 stations in point sensor systems59) do not introduce additional latency relative to point sensor systems.

Unlike the channel-wise approach8,35, our system segments the fiber, using two sets of fiber segments that each serve a distinct purpose. Long fiber segments of arbitrary geometry are utilized for beamforming in the detection-location module to focus on the direct seismic phases that are expected to exhibit long coherency lengths. Concurrently, this ensures that we filter out scattered waves with short coherency lengths, whose beam will not necessarily point to the earthquake source. Another layer of filtering of scattered waves is implemented by limiting the maximum slowness in the beamforming, focusing on faster waves. For the detection-location module, complex fiber geometry is preferred as the array response is less ambiguous than that of a linear array, yielding more focused backazimuth and mean slowness estimates. Short linear segments are utilized in the magnitude estimation module to convert strain rate to acceleration. We use linear segments because the apparent slowness used for the conversion is well-defined only for a linear array. We use short segments because, unlike in the detection-location module, here we wish to accommodate both direct and scattered waves (with short coherency lengths), as both carry energy from the source. Properly accounting for both direct and scattered waves is crucial for accurate magnitude estimation.
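The strain-rate-to-acceleration conversion on a short linear segment can be illustrated with the standard plane-wave relation, in which the axial strain rate equals minus the apparent slowness along the fiber times the ground acceleration along the fiber. The Python sketch below assumes this textbook form and sign convention, which may differ in detail from the conversion used in this study; the function names and the `p_min` guard against near-zero slowness are hypothetical.

```python
import numpy as np


def strain_rate_to_acceleration(strain_rate, slowness_s_per_m, p_min=1e-5):
    """Plane-wave conversion: acceleration = -strain_rate / apparent slowness.

    strain_rate      : array of strain rate samples (1/s)
    slowness_s_per_m : scalar or per-sample apparent slowness along the fiber
    p_min            : samples with |slowness| below this are returned as NaN,
                       since the division becomes unstable (assumed guard)
    """
    strain_rate = np.asarray(strain_rate, dtype=float)
    p = np.broadcast_to(np.asarray(slowness_s_per_m, dtype=float),
                        strain_rate.shape)
    acc = np.full(strain_rate.shape, np.nan)
    ok = np.abs(p) >= p_min
    acc[ok] = -strain_rate[ok] / p[ok]  # units: m/s^2
    return acc


def acceleration_rms(acc):
    """Root-mean-square acceleration, ignoring NaN samples."""
    return float(np.sqrt(np.nanmean(np.asarray(acc, dtype=float) ** 2)))
```

Passing a per-sample slowness array rather than a scalar mimics a time-dependent apparent slowness, so that fast direct waves and slow scattered waves are each scaled by their own value.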

The fiber used in this study was not deployed for seismological purposes, i.e., its location and geometry are not optimized relative to expected earthquake locations. Hence, we chose to maximize the number of long segments used for location, thereby optimizing proximity and azimuthal coverage relative to the expected sources. To comply with the real-time computation time of each data packet, we compromised on the number of short segments used for magnitude estimation. Despite the good results we obtained with a few short segments, we advise that future implementations, preferably with a more suitable fiber geometry, use additional short segments to increase the robustness and stability of the magnitude estimation. This can be achieved by using fewer long segments for location, decreasing the total number of channels by down-sampling, or adding computing power.

Accurate location estimation serves two purposes: to estimate the magnitude, and to predict ground motion at target locations. Since the former affects the latter, it is useful to quantify the sensitivity of magnitude to location errors. We found that magnitude estimation is not very sensitive to location errors, as evidenced by the reasonable initial magnitude estimation in Fig. 8c, despite the relatively large initial location error (Fig. 8a). Figure 9 shows the magnitude estimation error (as deviation from the final real-time magnitude estimation) for the \(\textrm{M} = 3.3\) 2024-03-23 18:35:56 event, assuming different fixed earthquake locations and using the final acceleration RMS values (Fig. 4b). As expected, locations that are closer to the fiber than the real epicenter yield lower magnitude (blue colors) and vice versa (green to yellow colors). Yet, overall, magnitude estimation errors are not larger than 0.75 magnitude units in the extreme cases (hundreds of kilometers from the fiber). The same sensitivity analysis for the other earthquakes is shown in Supplementary Figures S9–S11.

Fig. 9

Magnitude estimate sensitivity to location error (assuming hypocentral depth of 10 km) for the \(\textrm{M} = 3.3\) 2024-03-23 18:35:56 earthquake. The black contours correspond to the background colormap values. At the presented distances, the magnitude is only slightly sensitive to location errors.

A common practice in conventional EEW systems is to utilize empirical relations between magnitude and peak ground displacement in the first few seconds following P-wave arrival60 for magnitude estimation. This approach, despite being empirical, is rooted in basic seismological theory, which directly relates the far field displacement low-frequency spectral plateau to the seismic moment (which is proportional to the magnitude)61. Since velocity and acceleration are analytically related to displacement, other ground motion-based scaling relations with magnitude are possible, e.g., peak velocity (Pv)62, peak acceleration (Pa)63, and squared velocity integral (IV2)64, all measured in the first seconds following P-wave arrival. These scaling relations between peak ground motions and magnitude have also been theoretically developed39,47,65,66. Since strain-rate and acceleration are theoretically related through the apparent slowness52, and since acceleration is, in turn, theoretically related to the seismic moment39,47,65,66, an empirical relation between strain rate and magnitude36,67 can be assigned with physical interpretation. The global and channel-specific constants in the empirical approach to magnitude estimation that uses DAS strain-rate36 can be interpreted to include an implied strain-rate to ground motion conversion (Eq. 2) using a constant or channel-specific slowness value68.

Our magnitude estimation approach is more complex and computationally demanding compared to empirical real-time magnitude estimation36. The complexity stems from the need to convert strain rate to acceleration in real time rather than assuming a constant apparent slowness68. However, using a predefined constant slowness is disadvantageous for several reasons. First, it requires lengthy calibration using earthquake recordings, hindering fast implementation of the system. Second, it wrongly assumes the same slowness for direct and scattered waves69 and ignores the apparent slowness dependency on the array geometry and the wavefront propagation direction. Third, average slowness is ill-defined: for the same fiber channel, the signal of a large earthquake with a long source duration will be dominated by fast direct waves, while the signal of a small to medium-sized earthquake with a much shorter source duration will be dominated by slow scattered waves. The latter issue is aggravated since DAS recordings of large earthquakes are rare. To further quantify this issue, we compared the performance of our magnitude estimation and ground motion prediction with that achieved when strain rate is converted to acceleration using different constant slowness values. All other processing is identical. Figure 10 shows the magnitude estimation error (Fig. 10a) and the ground motion prediction error (Fig. 10b) variations using a constant apparent slowness value (solid line) and time-dependent apparent slowness (dashed line), as implemented in our system. The magnitude and PGA values are calculated using the final values of the acceleration RMS and the catalog epicentral distance (the same sensitivity analysis for the other earthquakes is shown in Supplementary Figures S12–S14). The magnitude estimation, and consequently the ground motion prediction, show relatively high sensitivity to the apparent slowness used for conversion (compared to the magnitude sensitivity to location error). This result demonstrates that the physics-based direct conversion of strain rate to ground motion is more robust for magnitude estimation and does not require any previous knowledge or calibration.
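The origin of this sensitivity can be seen from a simple scaling argument: if acceleration is obtained by dividing strain rate by the apparent slowness, an incorrect constant slowness rescales the acceleration by the ratio of the true to the assumed slowness. The Python sketch below quantifies that bias in \(\text {log}_{10}\) units. The function name `log10_acceleration_bias` and the slowness values in the comment are hypothetical, and the sketch stops at acceleration; mapping the bias to magnitude requires the ground motion model and is not attempted here.

```python
import numpy as np


def log10_acceleration_bias(p_true, p_const):
    """log10 of (acceleration using p_const) / (acceleration using p_true).

    With acceleration = strain_rate / slowness, the ratio of the two
    accelerations is p_true / p_const, independent of the strain rate.
    Both slownesses must be positive and in the same units.
    """
    p_true = np.asarray(p_true, dtype=float)
    p_const = np.asarray(p_const, dtype=float)
    return np.log10(p_true / p_const)


# Example with hypothetical values: a wave travelling at an apparent
# slowness of 0.2 s/km converted with an assumed 2.0 s/km gives an
# acceleration that is too small by one order of magnitude.
```

Because the bias depends only on the slowness ratio, any plausible range of constant slowness values spanning a factor of ten translates into a full order of magnitude in acceleration.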

While a direct comparison to empirical DAS-based magnitude estimation36 may be beneficial, such a relation does not exist for local DAS data, and its derivation is a subject for future work. Previous studies8,35 showed that such a relation needs to be recalibrated to properly apply to a new region, making previously obtained relations inapplicable here. We emphasize that we compare our ground motion predictions to actual peak ground accelerations from nearby accelerometers, constituting the more robust validation metric.

Fig. 10

Magnitude estimate and ground motion prediction sensitivity to constant and time-dependent apparent slowness for the \(\textrm{M} = 3.3\) 2024-03-23 18:35:56 earthquake. (a) Magnitude estimation deviation from the catalog magnitude as a function of the constant slowness used in strain-rate to acceleration conversion (solid curve). The deviation using time-dependent slowness is shown for reference (dashed horizontal line). (b) PGA prediction and observation residuals as a function of the constant slowness used in strain-rate to acceleration conversion (solid curve). The residuals using time-dependent slowness are shown for reference (dashed horizontal line).

The core functionality of our system relies on physical models, including plane wavefront propagation and coherency for detection-location, a theoretical ground motion model for magnitude estimation and ground motion prediction, and a pseudo-analytical strain-rate-to-ground motion conversion. This foundation is expected to require minimal calibration when deploying the system on a new fiber. Essentially, only the fiber geometry and choice of short fiber segments need to be considered. Thus, the presented DAS EEW system can be implemented on a new fiber with minimal parameter tuning in the detection-location module, and before enough data has been acquired to retrain empirical approaches. We thus conclude that our approach is general and especially appealing for fast EEW deployments.