Introduction

The 12 km x 15 km Campi Flegrei caldera (Fig. 1) arguably presents the highest volcanic risk on Earth, with nearly 500,000 people exposed to the threat of immediate evacuation in case of an impending eruption1 (https://www.regione.campania.it/it/printable/rischio-vulcanico-campi-flegrei). It has been active since at least 110 ka2. The VEI 7, Magnitude 7.8 Campanian Ignimbrite eruption occurred at 40 ka and represents the largest caldera-forming event3. After a long period of deflation following the last historical eruption in 1538 (ref. 4), unrest episodes characterized by ground uplift and seismicity occurred in the 20th century in 1950–52, 1969–72, and 1982–845. The most recent, ongoing period of unrest started in 2005 and is characterized by a progressive increase in ground uplift, a rising number of earthquakes with growing maximum magnitude, an escalation of the seismic energy released, and geochemical anomalies5,6,7,8,9,10,11,12,13,14,15,16. For these reasons, the alert level at Campi Flegrei has been at yellow since 2012, the second level of four, denoting conditions of mild unrest above background levels17. During 2023 and 2024, and through 2025, the number of earthquakes felt by the population progressively increased, with a maximum duration magnitude of Md = 4.6 recorded on March 13, 2025, the largest since a seismic monitoring network was installed at Campi Flegrei in 1970. These earthquakes caused limited damage to buildings but raised serious public concern, not least because a growing number of scientific studies now suggest the direct involvement of magma at relatively shallow levels (5–8 km depth) in the unrest11,13,14. The ground uplift has an overall symmetric bell shape and, by May 2025, had reached a maximum of more than 1.46 m at the center of the caldera since the beginning of the current unrest in 2005 (ref. 18).

Fig. 1: Location of the restless Campi Flegrei caldera and its neighboring active volcanoes of Vesuvius and Ischia.

The date of the last eruption of each volcano is indicated in italics. The hillshade of the terrain is overlain by the NASA population density (the intensity of the red indicates up to 30,000 persons/km²; https://neo.gsfc.nasa.gov/view.php?datasetId=SEDAC_POP) to highlight the high risk of the area. The INGV-OV seismic network (large yellow dots for the Campi Flegrei stations and small yellow dots for the Ischia and Vesuvius stations), both onshore and offshore, allows precise location of earthquakes, including those of very small magnitude (Mdmin = −1.6; see text). The simplified outlines of the Campi Flegrei nested calderas (red dashed lines) are taken from Vitale and Isaia (2014). To better highlight the risk at Campi Flegrei caldera, the figure shows: the area of current highest seismic risk (inner blue line), defined as the area with a minimum uplift of 10 cm since 2005 that encloses all epicenters of earthquakes with Md > 2 (https://www.protezionecivile.gov.it/it/normativa/decreto-legge-n-140-del-12-ottobre-2023-misure-urgenti-di-prevenzione-del-rischio-sismico-connesso-al-fenomeno-bradisismico-nellarea-dei-campi-flegrei/); and the red zone (outer solid dark red line), i.e., the area identified by the current Emergency Plan to be fully evacuated in case of an impending eruption (https://mappe.protezionecivile.gov.it/it/mappe-e-dashboards-rischi/piano-nazionale-campi-flegrei/).

Where multiparametric monitoring networks are in place, the failure forecast method (FFM) has been proposed as a way to assess the likelihood that an unrest episode will progress toward a volcanic eruption. The FFM relies on the acceleration of monitored parameters, such as seismicity, ground deformation, and geochemical anomalies, to indicate the potential ‘critical’ conditions at which magma pressure can induce a dyke to propagate through the country rocks19,20,21,22,23,24.
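As an illustration of how an accelerating parameter can be turned into a forecast, the sketch below uses the classical inverse-rate form of the FFM, in which the inverse of the observed rate is assumed to decrease linearly with time and the failure time is the time at which that line reaches zero. This specific form, the function name `forecast_failure_time`, and the synthetic rates are assumptions introduced here for illustration; the studies cited above may use different formulations.

```python
import numpy as np

def forecast_failure_time(times, rates):
    """Extrapolate the inverse rate linearly to zero (inverse-rate FFM sketch)."""
    times = np.asarray(times, dtype=float)
    inverse_rates = 1.0 / np.asarray(rates, dtype=float)
    # Straight-line fit: inverse_rate ~ slope * t + intercept
    slope, intercept = np.polyfit(times, inverse_rates, 1)
    if slope >= 0:
        return None  # the inverse rate is not decreasing: no acceleration to extrapolate
    return -intercept / slope  # time at which the fitted inverse rate reaches zero

# Synthetic, hypothetical series: rate = 1 / (k * (t_f - t)) with t_f = 100 days
t = np.arange(0.0, 90.0, 10.0)
rate = 1.0 / (0.02 * (100.0 - t))
t_fail = forecast_failure_time(t, rate)
print(round(t_fail, 1))
```

A scalar forecast of this kind says when failure may be approached, but not where, which is the limitation the spatial analysis in this study addresses.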

Kilburn (2018)23 proposed the mutual dependence of ground deformation and volcanotectonic (VT) seismicity as the expression, respectively, of the elastic and inelastic response of the crust to magma/fluid pressurization, where VT earthquakes represent rock fracturing beyond the elastic limit. Accelerations of monitored parameters are commonly evaluated from variations of scalar quantities, e.g., the number of earthquakes per day and/or the absolute ground uplift at a specific monitoring site (e.g., ref. 25). Tracking the trends of scalar quantities is reasonable for seismic activity, particularly in the early stages of volcanic unrest. During these stages, seismicity tends to be distributed within rock volumes occupied by the pressurized hydrothermal system, typically at depths ≤2–3 km. It is dominated by scattered swarms of small-magnitude VT earthquakes (commonly M ≤ 3) associated with diffuse microfracturing26,27,28,29. Seismic swarms differ from seismic sequences in that they do not follow or precede a mainshock30. In volcanic environments, they are not associated with extensive fault planes and are commonly attributed to rock microfracturing resulting from variations in pore pressure, similar to fluid-induced seismicity (e.g., refs. 31,32), during episodes of pressurization of the hydrothermal system, possibly in relation to magma migration and degassing33,34.

FFM approaches have been applied to Campi Flegrei in a series of recent works9,25,35,36,37. Bevilacqua et al. (2020)37 analyzed the evolution of earthquake occurrence probabilistically and, by applying the FFM, forecast that the caldera would reach a critical state in 2023 at the 50th percentile of probability and in 2031 at the 95th percentile. Kilburn et al. (2023)36, based on the variations in the relative rates of earthquake occurrence and ground uplift, suggest that since 2020 the caldera has entered an inelastic regime of deformation, in which brittle rock failure and permanent deformation begin to prevail over the elastic response to pressurization. Bevilacqua et al. (2024)25 suggest that the increasing exponential relationship between ground deformation and the cumulative number of earthquakes is indicative of progressive mechanical weakening of the stressed caldera rocks. However, FFM approaches do not consider the spatial distribution of the investigated parameters, so they lack crucial information on where pre-failure strain is localizing and thus on where a magmatic eruption or large earthquakes may occur. At Campi Flegrei, for instance, it remains uncertain if and where a fault or a series of faults accommodating the inelastic behavior and the mechanical weakening has nucleated or reactivated38. Scotto di Uccio et al. (2024)39 attempted to identify faults from the clustering of hypocenters by comparison with the documented surface expression of preexisting volcano-tectonic lineaments. However, while showing some possible overlaps, this overlay approach remains qualitative, requires expert evaluation, and is time-independent; hence, it does not provide constraints to the FFM.
By contrast, the distribution of the Energy Space Density of Campi Flegrei earthquakes from 2000 to 2023 shows, since 2023, a significant enlargement of the volume of fracturing rocks at ~3 km depth, approximately at the center of the caldera, which is the same region where the maximum uplift occurs10,40. This enlargement strictly depends both on the exponential increase of the number of earthquakes that occurred in 2023 and 2024 (Fig. 2), and on their relative energy released. While clearly indicating an increase in the inelastic response of the caldera rocks, an enlargement of the volume of fracturing rocks does not discriminate between diffuse rock microfracturing and spatio-temporal clustering along a new and/or reactivated volcano-tectonic fault or fault system.

Fig. 2: Areal distribution of epicenters of earthquakes registered by the INGV-OV seismic network in 2019–2024.

Each panel also shows the projection of hypocenters across N–S and E–W cross sections (data source, INGV-Osservatorio Vesuviano, 2024b).

In this study, we investigate the three-dimensional spatial and temporal clustering of earthquake hypocenters recorded between 2019 and 2024 (time of manuscript preparation) by the monitoring network of INGV-Osservatorio Vesuviano (2024b)41. The objective is to determine whether, when, and where seismicity has transitioned from diffuse microfracturing to localization along a well-defined fault plane. We use an innovative Monte Carlo approach designed to overcome the limitations in spatial clustering associated with the diffuse distribution of hypocenters within a relatively small and shallow crustal volume. Our findings indicate that, since 2023, onshore seismicity has evolved from a volumetrically dispersed pattern to a more localized distribution along a planar structure located near the center of the caldera. This clustering persisted on the same planar structure in 2024, with a strike of N249° ± 4° and a dip of 53° ± 1° (i.e., toward the NNW). The observed spatial and temporal evolution of seismicity is analyzed in the context of ongoing ground uplift, considering two possible scenarios: (i) the nucleation of a new volcanotectonic fault or (ii) the reactivation and unclamping of a pre-existing fault. In both cases, the implications for the assessment of volcanic and related hazards are significant.

Materials

We used the seismic catalog of Campi Flegrei compiled by the Istituto Nazionale di Geofisica e Vulcanologia, Osservatorio Vesuviano42. This official and publicly accessible catalog (https://terremoti.ov.ingv.it/gossip/flegrei/) provides the origin time, location with its uncertainties, and duration magnitude of each earthquake (compiled data can be accessed at the link provided in the Data Availability section). The events were located using manual picks and a layered velocity model, as described by Tramelli et al. (2022)43. The catalog encompasses seismic events recorded between January 1, 2019, and October 10, 2024 (time of manuscript preparation), with depths ranging from 0.1 to 5.6 km and duration magnitudes (Md) between −1.6 and 4.4. The average location errors are 0.2 km for both horizontal and vertical positions. As the location of micro-earthquakes is affected by large uncertainties, we selected earthquakes with Md ≥ 0.5, following the completeness magnitude determined by Del Pezzo and Bianco (2024)10 for the period 2000–2023. The well-distributed geometry of the seismic monitoring network, which also includes four offshore stations (Fig. 1), ensures a comparable level of earthquake location accuracy across the various sectors of the caldera, at least for seismic events with Md ≥ 0.5.

We chose to use the catalog provided by the Istituto Nazionale di Geofisica e Vulcanologia—Osservatorio Vesuviano because it is an institutional product, continuously updated and validated by the official monitoring authority of the area. Its certified and standardized nature ensures the consistency, transparency, and reproducibility of our analysis over time, which we consider essential for the scope of our work. We acknowledge that relocated catalogs (e.g., refs. 12,39,43), obtained through relative location techniques such as double-difference methods or source-specific corrections, can offer improved clustering and lower location uncertainties, which can facilitate the identification of seismicity patterns. However, such catalogs are often not available continuously or systematically over the entire time window of interest and may introduce a bias toward artificially increased clustering due to the nature of the relocation algorithm itself.

The goal of our study is to offer a statistical, unsupervised approach to identify temporal clustering of earthquakes, highlighting the potential activation of main fault structures. Several authors have previously mapped multiple fractures and faults in the area using both seismic and geological data; these structures are summarized in Iervolino et al. (2024)44. In that same work, the authors define the upper bound of reference magnitude scenarios as the simultaneous activation of multiple faults. In a similar spirit, our method aims to detect not just individual small-scale faults but also their possible merging or interaction over time. For this reason, we preferred a catalog that does not implicitly cluster events through a double-difference approach.

To ensure the reliability of the seismic locations used in the analysis, we applied a completeness magnitude threshold of 0.5; 95% of the events selected in this way have a location quality of B or better. This quality code incorporates several factors, including the root mean square (RMS) residuals, horizontal and vertical errors, the number of phase picks, and the azimuthal gap.

Approach

Seismicity at Campi Flegrei is mostly concentrated onshore in the center of the caldera, but over time it has also progressively extended offshore, where it is less frequent and deeper, defining an overall annular seismic volume, likely along ring fractures (Fig. 2; e.g., refs. 38,39). We focused our analysis on the most seismically active and at-risk onshore area, where the number of earthquakes ensures a statistically significant analysis, to assess whether earthquake hypocenters show evidence of clustering over time and space. The 3D spatial analysis over time presents several challenges. First, most events have magnitudes close to 0 or negative, for which the location uncertainty is too high for our application. Our analysis is therefore limited to events with Md ≥ 0.5, for which the average uncertainty on both horizontal and vertical location is around 200 m (ref. 10).

Second, the spatial distribution of hypocenters is highly inhomogeneous: the horizontal extent of event locations (about 6 km x 6 km) is larger than the depth range (mostly <3 km, with fewer offshore events extending down to 5.5 km). This type of spatial distribution introduces an unavoidable bias in any objective trend analysis of the whole dataset, which would inevitably flatten any planar trend being sought. To overcome this bias, we used a multiple inversion approach that analyzes the data falling into a hexagonally centered lattice of vertical cylindrical cells, with a radius and grid spacing chosen to give the minimal cell overlap that still guarantees the analysis of all data. The cylinders have a base diameter of 1000 m and a grid spacing of 866 m (Fig. 3). These dimensions provide a vertically extended computational domain (the ratio of vertical to horizontal extent is ~5:1 for each cylinder) that at the same time encloses a statistically significant number of events (see the “Methods” section).
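The lattice geometry described above can be sketched as follows. The sketch assumes that the 866 m grid spacing is the distance between neighbouring cylinder axes; under that assumption, circles of radius r cover the plane with no gaps when r ≥ d/√3, and d = 866 m gives d/√3 ≈ 500 m, i.e., the stated 1000 m diameter is the smallest one that leaves no event unassigned. The function names and the 6 km x 6 km test area are illustrative and are not taken from the authors' code.

```python
import numpy as np

RADIUS_M = 500.0    # cylinder base radius (diameter of 1000 m, from the text)
SPACING_M = 866.0   # assumed distance between neighbouring cylinder axes

def hex_lattice(x_min, x_max, y_min, y_max, spacing=SPACING_M):
    """Centers of a hexagonal lattice that covers the given rectangle."""
    row_step = spacing * np.sqrt(3.0) / 2.0          # distance between rows
    n_rows = int(np.ceil((y_max - y_min) / row_step)) + 1
    n_cols = int(np.ceil((x_max - x_min) / spacing)) + 2
    centers = []
    for j in range(n_rows):
        offset = 0.5 * spacing if j % 2 else 0.0     # every other row is shifted
        for i in range(-1, n_cols):
            centers.append((x_min + i * spacing + offset, y_min + j * row_step))
    return np.array(centers)

def assign_to_cylinders(xy, centers, radius=RADIUS_M):
    """Boolean matrix [n_events, n_cells]: True where an epicenter lies in a cylinder."""
    dist = np.linalg.norm(xy[:, None, :] - centers[None, :, :], axis=2)
    return dist <= radius

# Hypothetical epicenters scattered over a 6 km x 6 km area (coordinates in meters)
rng = np.random.default_rng(0)
epicenters = rng.uniform(0.0, 6000.0, size=(2000, 2))
cells = hex_lattice(0.0, 6000.0, 0.0, 6000.0)
membership = assign_to_cylinders(epicenters, cells)
print(membership.any(axis=1).all())   # every event falls in at least one cylinder
```

Because neighbouring cylinders overlap slightly, an event near a cell boundary can belong to two or three cylinders, which is the price of leaving no gaps.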

Fig. 3: Planar view of the lattice of analyses used in this study.

Red points identify the lattice cells; red circles indicate the cylindrical areas; blue dots represent the earthquake locations. The bases of the three cylinders (6_3, 5_4 and 6_4) with close plane orientations identified in 2023 and 2024 (indicated) are highlighted in red; the adjacent cylinder 7_3 (in orange) was added to define the merged volume for further MCC analysis in 2023 and 2024. The same analysis was also performed on the cylinder with a base circle in blue (see text for explanation).

Third, spatial analysis commonly uses the least squares method (LSM) to identify the trend that best fits the data, which in our case should represent the potential planar structure with which hypocenters may be associated. However, LSM assumes that all data belong to the same population and, because each datum is weighted by the square of its distance from the predicted distribution, it enhances the relevance of data falling far from the predicted model. Seismic swarms in calderas may instead be associated with variably overlapping causes (e.g., gas pressure in the hydrothermal system, magma pressure in geometrically complex reservoirs, strain localization along different pre-existing volcano-tectonic faults, and across variable rock rheology). In this study, we reduced this limitation by weighting the distance45 by its square root rather than its square, i.e., the least root square (LRS) method, which automatically reduces the influence of distant data (e.g., outliers) without imposing any arbitrary threshold. This method proves much more efficient than the LSM approach in identifying clustering within dispersed datasets (see the “Methods” section, where the performance of various exponents L [3, 2, 1, 0.5, 0.333] was tested on the 2023 data, confirming the best performance for the exponent L = 0.5 used to weigh the distance of the data from proposed planes). This requires a specific Monte Carlo approach; in this study, we used a Monte Carlo-Like Convergent Method (MCC; see the “Methods” section46). Data for each cylinder were processed with MCC to obtain the best planar solutions within those volumes. To test the reliability of the solutions, we performed a random data sampling analysis for the years of diffuse seismicity, confirming, as expected, that the variance is independent of the number of samples because the total number is statistically significant (see Supplementary information, Table 1a).
To test the influence of the lattice position on the trend identification, the grid analysis was repeated after shifting the lattice by a random value, which gave congruent results (Supplementary information, Table 1b). The minimum number of events required to define a potential plane was set to eight, which provides the minimum acceptable number of degrees of freedom (5 for this planar approach, i.e., eight data minus the three parameters of a plane) in three-dimensional space to obtain a reliable planar solution.
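The effect of the exponent on outliers can be illustrated with a deliberately simplified, one-parameter example. The sketch below is not the MCC inversion: it replaces the Monte Carlo search with a brute-force scan over horizontal candidate planes and uses a small synthetic dataset, both introduced here only for illustration. It shows that the misfit with exponent 2 (least squares) is pulled toward a minority of distant points, whereas the misfit with exponent 0.5 (least root square) stays on the main cluster.

```python
import numpy as np

def power_misfit(distances, exponent):
    """Sum of |distance|**exponent: 2 is least squares, 0.5 is least root square."""
    return float(np.sum(np.abs(distances) ** exponent))

# Hypothetical data: 20 hypocenters lying on a plane at z = 0 m, plus 5 outliers at z = 10 m
depths = np.concatenate([np.zeros(20), np.full(5, 10.0)])

# Candidate horizontal planes z = z0, scanned in 0.1 m steps (stand-in for the optimizer)
candidates = np.linspace(-2.0, 12.0, 141)

def best_offset(exponent):
    """Candidate plane with the lowest misfit for the chosen exponent."""
    costs = [power_misfit(depths - z0, exponent) for z0 in candidates]
    return float(candidates[int(np.argmin(costs))])

print(best_offset(2.0))   # least squares: shifted toward the outliers (the mean depth)
print(best_offset(0.5))   # least root square: remains on the 20 clustered events
```

With exponents below 1 the misfit is no longer convex and can have several local minima, which is one reason a stochastic search such as the MCC is a natural choice for the full three-dimensional problem.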

If the planes identified in adjacent cells show close attitudes (i.e., similar strike and dip, defined as planes falling inside the same quadrant, with poles at angular distances <45°), we verified the presence of a single planar structure extending across the lattice by performing a new MCC inversion within the merged volume of the related cylinders (Fig. 3 and Supplementary information, Table 2a). To avoid any bias imposed by the potentially elongated geometry of the merged volume, the analysis was also performed on a single larger cylindrical volume centered on the merged cylinders (Fig. 3 and Supplementary information, Table 2b). The MCC analysis was then repeated several times (the initial set of random values always yields slightly different results within a satisfactory resolution) to test the robustness of the solution and its associated error. Computing the mean and the standard deviation of the set of results allowed us to identify the presence of a consistent planar trend.
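The similarity criterion for adjacent planes can be sketched as the angle between their poles. The code below assumes a right-hand-rule strike convention (dip direction 90° clockwise from strike), which is consistent with the strike of N249° and the NNW dip direction reported in this study, and an East-North-Up coordinate frame; the function names are illustrative.

```python
import numpy as np

def pole_from_strike_dip(strike_deg, dip_deg):
    """Upward unit normal (East, North, Up) of a plane, right-hand-rule strike."""
    dip_dir = np.radians(strike_deg + 90.0)   # dip direction, clockwise from north
    dip = np.radians(dip_deg)
    return np.array([np.sin(dip) * np.sin(dip_dir),
                     np.sin(dip) * np.cos(dip_dir),
                     np.cos(dip)])

def angular_distance_deg(plane_a, plane_b):
    """Angle (0-90 degrees) between the poles of two planes given as (strike, dip)."""
    na = pole_from_strike_dip(*plane_a)
    nb = pole_from_strike_dip(*plane_b)
    # Poles are axes, not vectors, so the sign of the dot product is irrelevant
    cos_angle = np.clip(abs(np.dot(na, nb)), 0.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))

def similar_attitude(plane_a, plane_b, max_angle_deg=45.0):
    """True when the poles are closer than the threshold used in the text (45 degrees)."""
    return angular_distance_deg(plane_a, plane_b) < max_angle_deg

# The two merged-volume orientations reported in the Results (2023 and 2024)
print(angular_distance_deg((247.0, 53.0), (244.0, 53.0)))
print(similar_attitude((247.0, 53.0), (244.0, 53.0)))
```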

This multiple analysis procedure was also repeated after accounting for the location errors of the earthquakes. This was done by adding to each event a vertical and a horizontal shift chosen randomly within its associated location error. The results show only a modest increase in the scatter of the data about the identified planar trends, confirming the reliability of the solutions (Supplementary information, Tables 2c1 and 2c2).
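A minimal sketch of this perturbation test is given below. Several choices are assumptions made for illustration only: the shifts are drawn uniformly and independently on each axis, the plane fit is a total least squares fit via SVD standing in for the MCC inversion, and the hypocenters are synthetic points on a plane dipping 50° toward the north.

```python
import numpy as np

def jitter_locations(xyz, err_h, err_v, rng):
    """Shift each hypocenter by a random amount within its stated location error."""
    shift = rng.uniform(-1.0, 1.0, size=xyz.shape)
    shift[:, :2] *= np.asarray(err_h)[:, None]   # East and North components
    shift[:, 2] *= np.asarray(err_v)             # vertical component
    return xyz + shift

def fit_plane_normal(xyz):
    """Stand-in plane fit (total least squares via SVD), NOT the MCC inversion."""
    centered = xyz - xyz.mean(axis=0)
    normal = np.linalg.svd(centered, full_matrices=False)[2][-1]
    return normal if normal[2] >= 0 else -normal  # keep the upward-pointing pole

def dip_deg(normal):
    """Dip angle of the plane whose upward unit normal is given."""
    return float(np.degrees(np.arccos(np.clip(normal[2], -1.0, 1.0))))

rng = np.random.default_rng(42)
true_dip = np.radians(50.0)                       # hypothetical geometry
along_strike = rng.uniform(-1500.0, 1500.0, 300)
down_dip = rng.uniform(-1500.0, 1500.0, 300)
xyz = np.column_stack([along_strike,
                       down_dip * np.cos(true_dip),
                       -down_dip * np.sin(true_dip)])
errors = np.full(300, 200.0)   # 200 m, the average location error quoted in the text

# Repeat the fit on ten independently perturbed copies of the catalog
dips = [dip_deg(fit_plane_normal(jitter_locations(xyz, errors, errors, rng)))
        for _ in range(10)]
print(f"dip = {np.mean(dips):.1f} ± {np.std(dips):.1f}°")
```

The mean and standard deviation over the repetitions play the role of the solution and its associated error, as described above.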

Results

The MCC inversions of the distribution of earthquake hypocenters identified best-fit planes that vary in both space and time. The best-fit planes obtained within each cylindrical volume for each analyzed year are shown in Fig. 4.

Fig. 4: Temporal evolution of best-fit planes clustering earthquakes' hypocenters.

a Perspective views and b stereonet representation (lower hemisphere projection) of planes and poles of the planar features clustering earthquake hypocenters in the years 2019–2024. Since 2023, a planar structure well defined by hypocenter clustering appears in the center of the caldera (see text for explanation).

From 2019 to 2022, the total number of located events increased from 536 in 2019 to 797 in 2020, 1157 in 2021, and 1580 in 2022 (Fig. 2). The number of events with Md ≥ 0.5 increased from 119 in 2019 to 193 in 2020, 264 in 2021, and 443 in 2022. Significant planar trends (i.e., associated with a minimum of 8 events) were identified in only 6 cylinders in 2019, 5 cylinders in 2020, 8 cylinders in 2021, and 10 cylinders in 2022, all located in the area of maximum seismicity below Pozzuoli-Solfatara (Fig. 4a). These best-fit planes are associated with a small number of events (with a maximum of 51 in 2022; Supplementary information, Table 1b) and show completely scattered orientations (Fig. 4b), indicating a random distribution of earthquakes in the seismogenic volume and poor reliability of the best-fit planes.

With the sharp increase in seismicity observed in 2023 and 2024 (Fig. 2), the identified best-fit planes increased in number and spatial distribution, as well as in clustering progressively larger numbers of earthquakes.

In 2023, a total of 3448 events occurred, 1181 of them with Md ≥ 0.5 (Fig. 2), which allowed the identification of 26 cylinders with significant best-fit planes. Most of the best-fit planes are located in the Pozzuoli-Solfatara area, but they also extend into the area of Bagnoli (Fig. 4a). These planes cluster up to 143 events (Supplementary information, Table 1b). An offshore extension of the identified best-fit planes can also be seen in the Gulf of Pozzuoli (Fig. 4a). In 2023, the area of most intense seismicity, Pozzuoli-Solfatara, shows clustering of events around coherent NNW-dipping planes (Fig. 4b) in three contiguous cylinders, with best-fit planes oriented at a dihedral angular distance <28.5° (Supplementary information, Table 3).

In 2024, the number of earthquakes up to October (time of analysis) further increased to 4456 (Fig. 2), 1528 of them with Md ≥ 0.5. This allowed the identification of 30 cylinders with best-fit planes that occupy a vast portion of the caldera: around the zone of maximum seismicity at Pozzuoli-Solfatara, in the Bagnoli area, and offshore (Fig. 4a). The largest number of earthquakes occurs within the same three adjacent cylinders first identified in 2023 (Fig. 4b, Supplementary information, Table 1b), extended to a fourth adjacent cylinder, with close NNW-dipping orientations at a dihedral angular distance <36.8° (Supplementary information, Table 3).

We selected these four cylinders, identified as 6_3, 6_4, 5_4, and 7_3 (Fig. 3), to test the reliability of the results (Supplementary information, Tables 2a, 3) by running the MCC analysis in the merged volume of the four adjacent cylinders. The footprint of the four merged cylinders is about 3 km × 2 km (Fig. 3). To test the influence of the volume geometry, MCC analyses were also performed within a 3 km diameter cylinder centered on the merged volume, providing very close results (Fig. 3; Supplementary information, Tables 2b, 3).

Results confirm that in 2023 and 2024 the identified planes in the extended merged volume coincide with those identified within adjacent cylinders and provide orientations of N247°, 53° in 2023 and N244°, 53° in 2024 (Supplementary information, Table 2a). The total number of events that are associated with the clustering plane in the first ten months of 2024 is 401 (Supplementary information, Table 2a). Very similar results are obtained within the 3 km diameter cylinder centered on the merged volume (Supplementary information, Table 2b). The persistence of a very similar direction for the best-fit planes in 2023 and 2024 suggests that a change has occurred in the pattern of event location since 2023, which we will discuss as the nucleation of a potential incipient volcanotectonic fault (PIF) or the activation of a pre-existing volcanotectonic fault.

A robustness analysis was performed by repeating the MCC inversion 10 times for all 2023–2024 events located in the merged volume (hereafter the PIF volume), adding a random error to each event location within its declared location error. The results are presented in Fig. 5a and Supplementary information (Table 2c1) and show the strong reliability of the PIF solution found within the merged volume, with a mean value in 2024 of N249° ± 4°, 53° ± 1°. Very similar results are obtained within the 3 km diameter cylinder centered on the merged volume (Supplementary information, Table 2c2). Recalculating the best-fit plane within the PIF merged volume since 2019 (Supplementary information, Tables 2c1, 2c2) confirms that from 2019 to 2021 the orientation is random, and that in 2022 there is a first mild clustering around the PIF orientation, which then dominates in 2023 and 2024 (Fig. 5b). In parallel, the standard deviation of both strike and dip decreases substantially, demonstrating the effect of clustering over time (Fig. 5b). The dip angle also increases over time, because the PIF plane progressively prevails over the distributed small volumetric events, which tend to bias the solution toward subhorizontal or low-angle planes (e.g., 2019, 2020, 2021). By imposing the PIF orientation on the distribution of events in the PIF merged volume since 2019, it is also possible to appreciate both the increase in the number of events clustered around the PIF and the progressive decrease of the root mean power (RMP, which represents the mean residual value, computed with the exponent L = 0.5) associated with earthquakes at distances <200 m from the PIF (Fig. 5c).
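The RMP-based comparison can be sketched as follows. The exact definition of the RMP is given in the “Methods” section and is not reproduced here; the sketch assumes that it is the generalized power mean of the absolute point-to-plane distances, (mean |d|^L)^(1/L), which is an assumption made for illustration. The function names and the distances are likewise hypothetical.

```python
import numpy as np

def root_mean_power(distances, exponent=0.5):
    """Generalized power mean of |distance| (assumed RMP definition, see lead-in)."""
    d = np.abs(np.asarray(distances, dtype=float))
    return float(np.mean(d ** exponent) ** (1.0 / exponent))

def near_to_all_ratio(distances, cutoff_m=200.0, exponent=0.5):
    """RMP of events closer than cutoff_m to the plane, divided by the RMP of all events."""
    d = np.abs(np.asarray(distances, dtype=float))
    return root_mean_power(d[d < cutoff_m], exponent) / root_mean_power(d, exponent)

# Hypothetical point-to-plane distances in meters
distances = [0.0, 100.0, 400.0, 900.0]
print(root_mean_power(distances))
print(near_to_all_ratio(distances))
```

Under this assumed definition, a ratio that decreases from year to year means that the events within 200 m of the plane tighten around it faster than the catalog as a whole disperses.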

Fig. 5: Error analysis.

a Contouring (Schmidt stereonet, lower hemisphere) of the poles to the PIF planes identified in the merged PIF volume (cylinders 6_3, 6_4, 5_4 and 7_3, see Fig. 3) by a set of 10 repeated MCC analyses with random error added to the earthquake locations, showing the consistency of the MCC processing and the limited influence of the location error. b Plane features calculated within the PIF volume since 2019, showing a scattering of dip and azimuth until 2023 and 2024; colors of squares indicate the number of events associated with the plane (pale blue <250; red >500); error bars indicate 1 standard deviation. c Yellow diamonds indicate the progressive increase in the number of events associated with the PIF from 2019 until October 2024; green stars indicate the concomitant reduction in the ratio of the root mean power (RMP, L = 0.5) of hypocenters at distances <200 m from the plane with respect to the RMP of all hypocenters from the plane; the reduction of this ratio illustrates that the clustering around the PIF in 2023 and 2024 prevails over the dispersion of earthquakes in the progressively larger seismic volume.

While the analysis presented in this paper stops in 2024, i.e., the time of submission of the manuscript, during revision we also analyzed the seismicity that occurred in the first months of 2025. It shows a further increase in both the number of events and the total energy released, and confirms the persistence of the same clustering plane identified in 2023 and 2024.

Discussion

Seismicity and deformation are well-documented indicators of the evolution of caldera unrest and of the localization of intra-caldera fracture systems (e.g., refs. 47,48,49,50). Recent analyses of seismicity and ground deformation during the ongoing unrest at Campi Flegrei caldera highlight the marked and simultaneous changes in uplift velocity and seismicity rate that have occurred over the last few years. Bevilacqua et al. (2024)25 highlight the parabolic increase in ground uplift and a super-exponential rise in the number of earthquakes and the seismic energy released, interpreted as possibly related to a progressive mechanical weakening of the crust. Giudicepietro et al. (2025)15 showed that the style of seismicity has changed since 2021, with the development of burst-like swarms. Del Pezzo and Bianco (2024)10 show that, since 2023, the energy space density of the Campi Flegrei seismicity highlights an enlarged volume of fractured rocks at a depth of around 3 km. At the same depths, Calò and Tramelli (2025)40 and Giacomuzzi et al. (2025)12 identify a rock volume characterized by high Vp/Vs ratios, possibly owing to the presence of pressurized fluids, in qualitative agreement with electrical resistivity imaging51.

Very recent studies of geophysical and geochemical data now converge in attributing the current unrest at Campi Flegrei to the migration and accumulation of magma in the plumbing system (5–8 km depth), which, also by releasing volcanic fluids, pressurizes the shallow hydrothermal system (<4 km)11,13,14. The interpretation of the exact mechanisms behind the observed deformation patterns is still open to some discussion (e.g., ref. 52), and it remains unclear whether the system, as described by the FFM approach, is nearing or has already reached a critical state (e.g., ref. 36). Specifically, it is unclear whether the rocks surrounding the caldera plumbing system are already in their final stage of failure or still progressing toward that point. Our approach moves beyond the classic FFM and allows us to define if and where the deforming caldera shows evidence of a transition from diffuse microfracturing, typical of the early stages of unrest, to clustering and linkage of microfractures along a defined plane that may act as an individual fault or a fault system. By overcoming the main limitations to spatial clustering analysis of VT earthquake hypocenters, our innovative unsupervised approach shows that since 2023 the seismicity has shifted from being randomly scattered within the most stressed volume of the Campi Flegrei caldera, i.e., at its center, to being self-organized around a main clustering plane with orientation N249° ± 4°, 53° ± 1° (Fig. 5a). This transition may have important implications for the assessment of the maximum expected earthquake magnitude (e.g., ref. 44), and because developing faults may unload the system and become preferred pathways for the rise of both hydrothermal fluids and magma.

The clustering plane crosses the volume of maximum vertical deformation under Pozzuoli (Fig. 6a). With a horizontal extension of 2.8 km and a similar extension at depth, the plane dips inland and almost emerges at the Earth’s surface to the southeast of Pozzuoli (Fig. 6b; cf. refs. 38,39). Pozzuoli is the area of maximum axisymmetric strain (Fig. 6), which can be interpreted as indicating where the maximum stress (σ1) is localized and that it is vertically oriented. The attitude of the clustering plane (dip of 53°) and its position with respect to the maximum uplift are consistent with an Andersonian fault53 (Fig. 6a) with a theoretical internal friction angle of 16°, which is not unrealistic considering the weak rheology of the shallow Campi Flegrei rocks (e.g., refs. 54,55).
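The friction angle quoted above can be recovered from the standard Mohr-Coulomb relation for an Andersonian normal fault, assuming a vertical σ1 and taking the 53° dip as the fault angle. The symbols θ (fault dip), φ (internal friction angle), and μ (friction coefficient) are notation introduced here for this worked step.

```latex
\[
\theta = 45^\circ + \frac{\varphi}{2}
\quad\Longrightarrow\quad
\varphi = 2\theta - 90^\circ = 2\,(53^\circ) - 90^\circ = 16^\circ,
\qquad
\mu = \tan\varphi \approx 0.29
\]
```

For comparison, the same relation gives the familiar 60° normal-fault dip for a friction angle of 30°, so a 53° dip implies comparatively weak rocks.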

Fig. 6: Location, extension, and kinematics of the potential incipient fault.

a Cross-sectional view. b Planar view (red square net) of the PIF with respect to the axisymmetric ground uplift pattern centered at the RITE GNSS station (indicated as a yellow triangle along with the ACAE GNSS station). Hypocenters associated with the PIF in a are color-coded for duration magnitude from Md = 0.5 (light blue) to >3.5 (dark blue). All other hypocenters are gray in the background. Uplift isolines in b are in millimeters and show the uplift measured in 2024 (INGV-Osservatorio Vesuviano, 2024a). The dotted line in b indicates the trace of the cross-section shown in (a). The yellow surface in a represents the −3.84 iso-value envelope for the observed energy space density in 2023 (from Del Pezzo and Bianco 202410) and shows the concentration of the seismic energy along the PIF and over the central domal volume. Focal mechanisms in b show mainly extensional trends parallel to the PIF strike, which, associated with the PIF dip, suggests downthrow toward the NNW. This is perfectly coherent with the location of the area of geodetic anomaly (thick blue line from Giudicepietro et al. 2025)15 defined by a deficit in uplift rate.

Discriminating whether the identified clustering plane represents the reactivation of a pre-existing fault or the nucleation of a new fault is not easy. Field studies on the distribution of faults and fractures in the Campi Flegrei caldera floor, both onshore14,44,51,56,57,58,59,60,61,62,63,64 and offshore65,66, show many structural elements, mostly NE-SW and NW-SE oriented, that are associated with the episodes of volcanotectonic resurgence and subsidence since 15 ka63,67,68. However, field evidence does not clearly show the presence of a main fault with a position, orientation, and extension similar to those of the plane that is currently clustering the seismicity. In addition, the high density and quality of the INGV seismic monitoring network, which now allows the clustering plane to be detected and tracked during the current unrest, was not available during the former unrests (1982–84; 1969–72; 1950–52), making it impossible to ascertain whether previous seismic activity was already clustering on a similar plane (cf. ref. 69).

Nevertheless, we note that the event distribution along the clustering plane is rather stochastic, suggesting that this plane is formed by the progressive linkage of micro-dislocation surfaces. By comparing the geometry of the clustering plane and the energy space distribution10 (Fig. 6a), we note that both localized and scattered seismicity occur in a volume that is centered on, but is larger than, the plane itself. Also, some of the mainshocks do not occur on the clustering plane (Fig. 6a), suggesting that damage and energy dissipation are not confined to the region where the linkage of micro-fractures is occurring, possibly involving also secondary planes (e.g. ref. 70). This can be attributed to the stress field’s vertical orientation. The progressive definition of a clustering plane is substantiated by the location of most earthquakes, including several of the largest magnitude ones. Consequently, this plane localizes the largest energy released (Fig. 6a), but also other structures can be activated.

We therefore prefer to interpret the change in the spatial distribution of seismicity occurring since 2023 as the birth of a potential incipient fault (PIF). The adjective “potential” clarifies that we have so far identified a statistical, not a physical, fault plane. With the term “incipient”, we stress that we are not yet describing a fully developed classic seismic fault, but rather the initiated and ongoing process of self-organization of the micro-fracturing, which is in the course of defining a fault plane. This is important because the merging of small dislocation planes into a larger one results in a large increase in stress at its tips, thus driving their propagation and eventually the full development of a fault71, whose dimensions control the maximum expected magnitude on that plane (cf. ref. 44).

It is worth specifying that a fault is generally a complex structure that involves a volume (i.e., the fault zone), often with several displacement surfaces that at times may concentrate most of the slip, i.e., the fault core (e.g. ref. 72).

There are two end-member models for fault generation. The first involves nucleation of the fault from an initial, relatively small zone where the acting stress exceeds the brittle strength of the rock; the fault then expands along its edges, eased by the stress concentration generated at its tips73. This evolutionary model applies when regional (e.g., external) stresses are acting.

The second considers the initial development of a cloud of small fractures with limited displacement. As their dimensions increase due to increasing stress and, again, fault tip stress concentration, adjacent faults join and, if their sense of movement is compatible, form a larger fault with increasing displacement. Eventually, the linkage process ends with the development of a major fault surface73,74,75. This latter model seems to apply to the development of the observed PIF, where a progressive clustering of seismic events is observed along a preferential orientation, ascribable to a sort of creeping along minor faults76. The required coherence among the minor fault population relates to the mechanism of their origin, which in the case of the Campi Flegrei unrest is the extensional regime produced by the deformation resulting from the observed uplift13. This may exclude an origin of the PIF as the effect of an increase in regional (tectonic) stress; the PIF can instead be considered the effect of the kinematics of the resurgence acting in the Campi Flegrei caldera, allowing us to classify the PIF within the kinematic fault family (that is, faults produced by rock dislocation rather than an asymmetrical regional stress77). Consequently, this evolutionary model does not predict a significant increase in the dimensions of the fault plane or in the magnitude of future seismic events, which for Campi Flegrei are expected to reach a maximum of Mw = 4.4 (ref. 44). On the other hand, the PIF evolution will likely result in an increase in its displacement, thus providing an increasingly efficient gravity discharge over the resurgent block, potentially resulting in a progressive acceleration of its uplift. This is because, in a stress field characterized by a vertical σ1, normal fault slip reduces the hanging-wall thickness (i.e., the load on the fault).
Initial effects of the PIF normal dislocation could well explain the observed geodetic anomaly with a relative decrease in the uplifting rate around the GNSS station ACAE (Fig. 6b), which, since 2023, is altering the perfect axisymmetric shape of the uplift15, as well as other time-dependent deformation features that have been highlighted to occur in correspondence of the PIF (e.g., refs. 52,78). The area of geodetic anomaly is coherent with a normal displacement toward the NNW, also in agreement with focal mechanisms (Fig. 6b). The dipping of the PIF towards inland (NNE), that is towards inner, higher elevation areas (with a higher gravity load), represents, anyhow, a delaying factor to its displacement. Nevertheless, great care must be taken to monitor its development and evolution, given its influence on resurgence patterns and its potential to provide preferential pathways for uprising hydrothermal and volcanic fluids. We finally remark that the proposed unsupervised analysis of the spatio-temporal evolution of seismicity during volcanic unrest has a general validity and can be applied to any active volcanic system.

Methods

Foreword

The least squares method (LSM) is a widely and successfully used mathematical algorithm to test and identify linear trends of a set of data45 and, with some iterative computations, can be extended also to non-linear trends79. The least squares method is characterized by a fast and simple computation by minimizing the sum of the squares of the deviations of the data values from a given expected trend

$${{{\rm{M}}}}{{{\rm{i}}}}{{{\rm{n}}}}{\sum }_{i}{({z}_{i}-f({x}_{i},{y}_{i}))}^{2}$$

This is easily solved by zeroing the parameter partial first derivatives of this equation.

Despite its advantages, two severe boundary conditions limit the wide and indiscriminate use of the LSM. The first is the assumption that all data must belong to the population following the searched trend shape. This constraint is difficult to fully respect in nature due to the complexity of the real world, where the scattering of measured data may not simply depend on the diffusion of a single process but on several, often independent and overlapping causes, including random ones. The application of the LSM to such complex data populations may result in unreliable or fake solutions.

The second limiting condition is intrinsic in the method. Measures in the LSM approach are weighed by the square of their distance along the dependent variable from the searched trend. This provides, when data belongs to different populations (including randomly scattered ones), a strong bias in the relative relevance of each datum, since measures more distant from the trend will weigh in the computation far more than the data that follow closely the searched trend. This is a strong limitation, since, in a complex situation (i.e., where measures depend on more than a single population/trend as in this study), the farthermost data are those that most likely do not follow the trend under test, yet accordingly they will weigh (by the square of the distance) more than the closer trend data. Therefore, a small percentage of data belonging to a different population is sufficient to invalidate the results.

To overcome this limit, a common (and dangerously subjective) technique is often applied, consisting of the identification and elimination of the data more distant from the desired trend, classified as “outliers”. Generally, this is accomplished by discarding from the inversion process the data that show the (assumed) larger distance from the initially approximated trend and repeating the inversion without them. This represents a serious external influence on the data and should always be avoided or used with great care, since it is strongly influenced by the subjective rules applied, as well as by the relative number of data following the searched trend with respect to those that belong to different populations. Outlier identification becomes even more complex as the number of data increases, especially with a strongly scattered data population, i.e., where only a reduced percentage of data follows the expected trend. This latter scenario is expected for populations of seismic event locations, such as in the studied Campi Flegrei active region, where the spatial distribution of the events derives from the simultaneous action of at least two main causes: the doming related to gas escape and/or the rise of a magmatic body, and the potential presence of one or more faults acting at the same time. The first cause will provide a locally scattered swarm of events. The activation of an active fault or a PIF would provide a concentration of events along the approximately planar fault zone. This scenario closely resembles the distribution of the locations of seismic events during the last Campi Flegrei crisis, where data are consistently distributed along an annular shape around the area, yet with a very strong asymmetry, with most of the events increasingly located along the western coast of the Pozzuoli area in recent years.

The least root method (LRM)

With these premises, a slightly different and objective method is proposed, deriving from Tarantola (2005)45 and used in the present study: the least root method (LRM). Similar to the LSM approach, this method seeks to minimize the sum of the square root of the absolute distances between trend and the dependent variable (elevation z, in our study), thus lowering the weight of the most distant data.

The equation to be minimized is then

$${{{\rm{M}}}}{{{\rm{i}}}}{{{\rm{n}}}}{\sum }_{i}{|({z}_{i}-f({x}_{i},{y}_{i}))|}^{L},$$

where L = 0.5 is the exponent used in the LRM.
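The effect of the exponent on the weight of distant data can be illustrated with a minimal Python sketch. The residual values and the function names `power_cost` and `outlier_share` are hypothetical, chosen only for illustration, and are not part of the original VB6 implementation.

```python
import numpy as np

def power_cost(residuals, L):
    """Sum of |residual|**L: L = 2 gives the LSM cost, L = 0.5 the LRM cost."""
    return float(np.sum(np.abs(np.asarray(residuals, dtype=float)) ** L))

def outlier_share(residuals, L):
    """Fraction of the total cost contributed by the largest residual."""
    terms = np.abs(np.asarray(residuals, dtype=float)) ** L
    return float(terms.max() / terms.sum())

# Hypothetical residuals: three data close to the trend, one far from it.
residuals = np.array([1.0, 1.0, 1.0, 100.0])

share_lsm = outlier_share(residuals, 2.0)  # 10000 / 10003
share_lrm = outlier_share(residuals, 0.5)  # 10 / 13

print(f"LSM (L = 2):   distant datum carries {share_lsm:.1%} of the cost")
print(f"LRM (L = 0.5): distant datum carries {share_lrm:.1%} of the cost")
```

With the square, the single distant datum carries almost the whole cost; with the square root, the three close data still account for a meaningful share of it.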

Unfortunately, in most cases this equation cannot be easily minimized analytically, and the distances must be taken as absolute values to remain positive, a condition that is automatically satisfied when their squares are used, as in the LSM. The elegant and fast analytical solution provided by the LSM therefore cannot be used and must be replaced by an iterative Monte Carlo convergent (MCC) search for the minimum sum, which compares different sets of parameters. This computation is conceptually simpler, yet it requires long, iterative computations (easily done thanks to the speed of present-day computers). As in the Monte Carlo method, the result is achieved by an iterative “trial and error” approach that tests a huge number of different and independent parameter sets; its advantage resides in the large number of independent parameter sets available.

A quantitative representation of the mean residual value is provided by the computation of the root mean power (RMP) value as

$${{{\rm{R}}}}{{{\rm{M}}}}{{{\rm{P}}}}={\left[\left({\sum }_{i}{|{z}_{i}-f({x}_{i},{y}_{i})|}^{L}\right)/n\right]}^{1/L},$$

where n is the number of data.

It is immediately seen that for L = 2, RMP corresponds to the classical RMS value. In the applied LRM method, the exponent is L = 0.5, thus exchanging the square and square root exponents.
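As a minimal sketch of the RMP computation, the Python function below assumes that the exponent L is applied to each absolute residual before summation, which is the form that reduces to the RMS for L = 2. The function name `rmp` and the residual values are hypothetical.

```python
import numpy as np

def rmp(residuals, L):
    """Root mean power: (mean of |residual|**L) ** (1 / L)."""
    r = np.abs(np.asarray(residuals, dtype=float))
    return float((np.sum(r ** L) / r.size) ** (1.0 / L))

# Hypothetical residuals between observed and expected elevations.
residuals = np.array([0.5, -1.0, 2.0, -4.0])

rms = float(np.sqrt(np.mean(residuals ** 2)))
rmp_2 = rmp(residuals, 2.0)     # coincides with the classical RMS
rmp_half = rmp(residuals, 0.5)  # the value used with the LRM

print(rms, rmp_2, rmp_half)
```

For the same residuals, the L = 0.5 value is smaller than the RMS because the largest residuals weigh less in the average.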

It is worthwhile noticing that the used LRM implementation can be extended to any value of the power of the distances. Its decrease would produce a progressive lowering of the weight of data that are farther from the main trend, with the risk of providing excessive emphasis on random alignment of data along a fictitious trend (as in the case of rounded coordinates along given x or y coordinates).

In order to test the reliability of the application of LRM with respect to other possible minimization solutions, we tested the use of different values of the power L of the distances (i.e., exponents 3, 2 that is the LSM, 1, 0.5 that is the adopted LRM, and 0.333) on a sample of our dataset, i.e., the earthquake hypocenters recorded in 2023 in the PIF area. Results are shown in Fig. 7 and confirm the best reliability of the L = 0.5 approach, with a high number of associated events and reduced (LRM) distance average value.

Fig. 7: Comparison among the results of MCC analyses with different exponents L related to the 2023 events selected within the PIF area.

MCC was applied for exponents 3 (L3), 2 (L2, LSM), 1 (L1), 0.5 (LRM), and 0.333. Graphs show the values of strike (blue diamonds) and dip (red squares) of the plane solution, the RMP (root mean power) residuals (green triangles), and the number of earthquakes associated with the plane (blue crosses). The total number of analyzed earthquakes is 576. Note that reducing the exponent produces a concomitant increase in the number of associated events and a reduction of their distance from the plane (RMP), which testifies to the presence of “outliers”, i.e., earthquakes not associated with the plane solution. The maximum number of events close to the plane is achieved for exponents L equal to or lower than 0.5, the value adopted for the LRM in this work, where over 55% of the events fall close to the PIF plane. At exponent L = 0.5, there is also the minimum of the RMP residuals. It is important to note that the presence of scattered data strongly influences the dip and azimuth of the best-fit plane, as expected from the uneven distribution of data in the three dimensions. There is a small difference between the results from exponents L = 0.5 and 0.333. Yet, the smaller the exponent L, the larger the influence of artifacts such as the rounding of the event coordinates. Accordingly, the exponent L = 0.5 was considered the best compromise to process the data.

The MCC implementation was proposed in Tarantola (2005)45 and Schmidt and Salvini (2024)46 and is characterized by the search for the possible parameters within a variable range of values for each parameter. A cycle of attempts of random values of parameters, within the given range, is used to compute a plane and then compared by LRM with the data. These ranges are then focused (and reduced) on a successive cycle of attempts around the best parameter set found by the application in the previous cycles, thus providing an unsupervised machine learning approach to the process. These ranges reduce progressively by a given fraction at any cycle of attempts to improve the number of parameter sets closer to the expected satisfactory result and thus improve the probability of a reliable final match. Figure 8 shows a qualitative example of the range reduction of the 3 parameters compared to the progressive parameter best values during the range reduction process in the progressive cycles.

Fig. 8: Quantitative graph generated by MCC of the 2023 earthquakes in the PIF area to show the progressive reduction of parameter search range through cycles.

Colors refer to the plane parameters: red is dz/dx; green is dz/dy; blue is z0, i.e., the intersection of the plane to z-axis. Numbers indicate the range scale along the x-axis. The y-axis is the Cycle number (1–60). Continuous lines show the range migration limit values. Squares indicate the best value found at each cycle.

Application of LRM to the search for a planar trend within a biased dataset

In this study, the MCC method was used with the LRM approach, with 60 cycles of 200,000 attempts each and a parameter range reduction fraction r = 0.1 at each new cycle. This change in the parameter range at each cycle provides a new independent set of random parameter values within the 16-digit precision of the computation. The code was developed in compiled VB6 (Microsoft); the computation time for each MCC inversion depends strongly on the performance of the computer used and is directly proportional to the number of data. Our application was able to perform the required 12,000,000 iterations with 1500 measures in about 1 h. After the 60 cycles, the parameter ranges were reduced to \({(1-0.1)}^{60}=0.001797010\) times the initial ranges, thus providing a very high resolution in the final best parameter values (Fig. 8).

In the implementation of the MCC used in this study, we used a set of random number generators with over \(1.6\times {10}^{7}\) numbers. This set was then used to generate random parameter values within the given parameter range at that cycle, with a resolution of the extremes of 16 digits (i.e., over \({10}^{16}\) possible combinations). In this way, this procedure provides for each cycle a new set of values different from all the previous ones. The theoretical number of independent random values in this study, therefore, exceeds \({10}^{16}\). The test on the procedure used showed no value repetitions within \(3\times {10}^{10}\) random numbers generated, thus fully covering the number of parameter sets used. Furthermore, at regular intervals (i.e., every 43,051 attempts in our application), the parameter range boundaries were randomly modified by 0.0001% to prevent any useless and dangerous repetition.

This easily exceeds the number of random values required for an MCC processing, which is equal to:

No. of cycles * No. of attempts * No. of independent parameters = 60 * 200,000 * 3 = 36,000,000.

The MCC implementation of the LRM method has been applied to identify the presence of active potential incipient faults (PIFs) by searching for planar trends within the highly scattered locations of seismic events.

In this analysis, we consider that each measured event belongs either to the planar distribution of a PIF or to the scattered data (possibly related to the doming):

$${{{\rm{D}}}}_{{{\rm{i}}}}(x,y,z)={{{\rm{R}}}}_{{{\rm{i}}}}|{{{\rm{F}}}}_{{{\rm{i}}}}$$

where R is the scattered population and F is the searched planar feature represented as

$${{{\rm{z}}}}(x,y)={{{\rm{a}}}}(x+\Delta x)+{{{\rm{b}}}}(y+\Delta y)+{{{\rm{c}}}}$$

and represents events clustering along a planar surface trend.

z is the hypocenter depth (converted as a negative value, i.e., elevation). The unknown parameters are a, b, c, which represent the analytical parameters of the searched plane in a space centered at Δx and Δy.

x, y are the UTM coordinates of the event location (converted from the original Geographic coordinates), and z is the expected depth of the plane at position x, y.

The method is therefore applied to find the parameter values that minimize the sum of the square roots of the absolute differences between the computed depth of each event and the expected depth of the plane at the same point:

$${{{\rm{M}}}}{{{\rm{i}}}}{{{\rm{n}}}}{\sum }_{i}{\left|{z}_{i}-a({x}_{i}+\Delta x)-b({y}_{i}+\Delta y)-c\right|}^{L}$$

Δx and Δy parameters were fixed in the analysis and centered in the mean x and y measured values:

$$\Delta x=\left({\sum }_{i}{x}_{i}\right)/{{{\rm{N}}}}$$
$$\Delta y=\left({\sum }_{i}{y}_{i}\right)/{{{\rm{N}}}}$$

where xi, yi are the horizontal coordinates of the i-th event, N is the total number of events, and L = 0.5.

At each attempt, a set of parameters \({P}_{j}\) (i.e. the parameters a, b, c) is generated as

$${P}_{j}={{{\rm{Prange}}}}_{j}\ast {{\rm{rnd}}}+{{{\rm{Pmin}}}}_{j}$$

where Prangej = (Pmaxj–Pminj) and represents the width of the searched interval, and rnd is the (next) generated random value. Pmaxj, Pminj are the range limits of the j parameter at the given cycle.

The initial parameter ranges are:

a: Pmin1 = −100, Pmax1 = 100

b: Pmin2 = −100, Pmax2 = 100

c: Pmin3 = −1,000,000 m, Pmax3 = 1,000,000 m

At the completion of each cycle i of attempts, the interval ranges Prangej are narrowed and recomputed around the best inversion results obtained in the previous cycle, according to:

Prangej,i+1 = Prangej,i*(1–r), where r is the range reduction fraction (0.1 in this study)

Pminj,i+1 = Pbestj–(0.5*Prangej,i+1), where Pbestj is the best approximation of the j parameter found throughout the cycles.
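The search procedure described above can be sketched in Python as follows. This is a minimal illustration, not the original VB6 code, and it rests on several assumptions: the data are synthetic, the coordinates are in kilometers and already centered on their mean, the number of attempts per cycle is reduced to 2000, the initial ranges are narrower than those listed above, and all names (`lrm_cost`, `mcc_plane_fit`, the test plane coefficients) are hypothetical.

```python
import numpy as np

def lrm_cost(params, x, y, z, L=0.5):
    """Sum of |z - (a*x + b*y + c)|**L for each row (a, b, c) of params."""
    a, b, c = params[:, 0:1], params[:, 1:2], params[:, 2:3]
    return np.sum(np.abs(z - (a * x + b * y + c)) ** L, axis=1)

def mcc_plane_fit(x, y, z, p_min, p_max, cycles=60, attempts=2000,
                  r=0.1, L=0.5, seed=0):
    """Random search with a range that shrinks by the fraction r per cycle."""
    rng = np.random.default_rng(seed)
    p_min = np.asarray(p_min, dtype=float)
    p_range = np.asarray(p_max, dtype=float) - p_min
    best, best_cost = None, np.inf
    for _ in range(cycles):
        # P_j = Prange_j * rnd + Pmin_j for every attempt and parameter
        trial = p_range * rng.random((attempts, 3)) + p_min
        cost = lrm_cost(trial, x, y, z, L)
        k = int(np.argmin(cost))
        if cost[k] < best_cost:  # keep the best set found in any cycle
            best, best_cost = trial[k].copy(), float(cost[k])
        # Narrow the range and re-center it on the best set found so far
        p_range = p_range * (1.0 - r)
        p_min = best - 0.5 * p_range
    return best, best_cost, p_range

# Synthetic test: 200 events near a plane plus 100 scattered events.
data_rng = np.random.default_rng(1)
n_plane, n_scatter = 200, 100
x = data_rng.uniform(-1.5, 1.5, n_plane + n_scatter)
y = data_rng.uniform(-1.5, 1.5, n_plane + n_scatter)
a_true, b_true, c_true = 0.5, -0.3, -2.0
z = a_true * x + b_true * y + c_true + data_rng.normal(0.0, 0.02, x.size)
z[n_plane:] = data_rng.uniform(-4.0, 0.0, n_scatter)  # scattered population

best, best_cost, final_range = mcc_plane_fit(
    x, y, z, p_min=[-5.0, -5.0, -10.0], p_max=[5.0, 5.0, 10.0])
print("best (a, b, c):", best, "final ranges:", final_range)
```

Keeping the best set across all cycles, rather than only within the last one, follows the definition of Pbest given above; the window can still migrate between cycles because it is re-centered each time a better set is found.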

At the completion of the 60th and last cycle, the final parameter search ranges in this study were:

Prange1 = 0.359

Prange2 = 0.359

Prange3 = 3594
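These final ranges follow directly from the initial ranges and the reduction fraction, as the short Python check below shows. It assumes only the values stated in the text (r = 0.1, 60 cycles, initial widths of 200 for a and b and 2,000,000 m for c); the variable names are hypothetical.

```python
r, cycles = 0.1, 60

# Cumulative shrink factor after 60 cycles: (1 - r) ** 60
factor = (1.0 - r) ** cycles

# Initial search widths, Pmax - Pmin, for the three plane parameters
initial_ranges = {"a": 200.0, "b": 200.0, "c": 2_000_000.0}
final_ranges = {name: width * factor for name, width in initial_ranges.items()}

print(f"factor = {factor:.9f}")
for name, width in final_ranges.items():
    print(f"Prange({name}) = {width:.3f}")
```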