Principal component analysis on twenty years (2000–2020) of geochemical and geophysical observations at Campi Flegrei active caldera

Petrillo, Zaccaria; Tripaldi, Simona; Mangiacapra, Annarita; Scippacercola, Sergio; Caliro, Stefano; Chiodini, Giovanni

doi:10.1038/s41598-023-45108-0


Article
Open access
Published: 27 October 2023



Scientific Reports volume 13, Article number: 18445 (2023)


Abstract

Campi Flegrei (CF) is an active and densely populated caldera in Southern Italy that has shown signs of significant unrest over the last 50 years. Because of the high volcanic risk, monitoring networks for the most sensitive unrest indicators have been implemented and improved over time. The resulting valuable database of geophysical and geochemical data has allowed the caldera unrest phases to be studied. In this paper we retrace the caldera history over the time span 2000–2020 by analyzing displacement, seismicity and geochemical time series in a unified framework. To this end, Principal Component Analysis (PCA) was first applied to the geochemical data alone because of their compositional nature. The first three components retrieved were then analyzed via PCA together with the geophysical and thermodynamical variables. Our results suggest that three independent processes are reflected in the geochemical observations: heating/pressurization of the hydrothermal system, a process related to the injection of magmatic fluids at the roots of the hydrothermal system, and a third process probably connected with deeper magmatic dynamics. The current volcano alert state seems mainly linked to variations in the activity of the hydrothermal system. Our approach made it possible to explore the interrelation among observations of different nature, highlighting the relative importance of the driving processes over time.


Introduction

In the last decades great efforts have been made to understand the complex structure of volcanic systems^1,2,3 and to define the chemical and physical processes characterizing the associated hydrothermal systems^4,5, together with their evolution through time. These efforts have mainly been aimed at mitigating volcanic hazard in densely populated areas where catastrophic explosive eruptions are expected. Campi Flegrei caldera (CFc, Fig. 1) in Southern Italy is a fitting example: it is a particularly dangerous active volcanic site, inhabited by more than 1.5 million people.

During its history, the caldera has experienced two very large explosive events that formed its primary structure: the Campanian Ignimbrite and the Neapolitan Yellow Tuff eruptions^6,7,8. This structure was then modified by the more recent Agnano Monte Spina and Astroni explosive eruptions, as well as by numerous subsequent eruptions, which generated pyroclastic deposits spread over an extremely large area^8,9. The last eruption at CF occurred in 1538 AD¹⁰. The caldera magmatic system is still active, as testified by bradyseismic episodes and by widespread fumaroles and thermal springs¹¹, which makes the volcanic risk in this area very high.

Over the years the scientific community working on CF has mainly focused on improving the monitoring systems^{12,13,14,15,16,17,18,19,20} and has developed numerous models to simulate volcanic and hydrothermal processes^{4,21,22,23,24,25,26,27,28,29}. These efforts aimed to obtain reliable data and sophisticated simulators that improve the confidence and predictive capability of the models, in order to mitigate volcanic risk and to anticipate and assess possible catastrophic scenarios.

Large sets of multivariate data resulting from long-term monitoring are difficult to treat and interpret. The basic strategy generally used is univariate statistical analysis, which can, however, introduce uncertainty and error when dealing with huge and inhomogeneous datasets³⁰. To avoid this problem, multivariate statistical techniques can be used: they are unbiased methods that can help indicate natural associations between samples and/or variables³¹, thus highlighting information not apparent at first glance.

In this study, we treat a large dataset of geophysical and geochemical data resulting from long-term (2000–2020) monitoring of the Campi Flegrei caldera. In particular, we use Principal Component Analysis (PCA) to examine vertical displacements, earthquake occurrences, the geochemical composition of fumaroles and the estimated temperature and pressure time series.

PCA is a multivariate statistical procedure widely applied for data processing and dimension reduction when large multivariate datasets are analyzed. PCA simplifies the data structure and helps data interpretation³². Its great advantage³¹ is the ability to detect broader patterns of interrelationships among data than individual univariate analyses provide³⁰. The main goal of this study is to identify and distinguish the different processes that have driven the ongoing crisis in the CF caldera over the last twenty years, and to determine how these processes interact statistically. The geochemical samples of fumarolic fluids, being compositional data, first require an appropriate transformation according to the theory of compositional data^{33,34,35,36,37,38}. After this transformation, a principal component analysis can be performed to extract components that synthesize the geochemical processes acting in the CF caldera. The geochemical data are then integrated with the geophysical data through a joint principal component analysis, which makes it possible to identify the relationships between the physical processes and the geochemical components. The relationships found between geophysical and geochemical processes allowed us to highlight the main process responsible for the last crisis.

Materials and methods

Compositional data analysis

All the usual statistical measures (e.g. mean, standard deviation, correlation, etc.) are defined in Euclidean space and, therefore, standard univariate and multivariate analyses can lead to erroneous results when applied to compositional data. The constant sum of each observation restricts the Euclidean sampling space to a simplex, a subspace of $\mathbb{R}^{p}$. According to the current literature, only the transformation of compositional data by the CoDA methodology^{33,34,35,36,37,38} allows a correct approach to the statistical data analysis. The three main transformations used in compositional data analysis are the additive log-ratio (alr), the centered log-ratio (clr)^34,39 and the isometric log-ratio (ilr)^40,41. The clr transformation is isometric³⁹ and allows a one-to-one interpretation of the relationships among the compositional variables, because the transformed data remain in a space with p coordinates⁴². Therefore, in this work, each compositional time series X(t) is treated according to the clr methodology:

$$\mathbf{Z}(t) = \operatorname{clr}\left[ X(t) \right] = \left[ \ln\left( x_{ij}(t)/g_{t} \right) \right] \quad \left( i = 1, \ldots ,n;\ j = 1, \ldots ,p;\ t = t_{0} , \ldots ,T \right) \tag{1}$$

where n is the number of observations, p is the number of variables and g_t is the geometric mean of the p parts of each composition at time t (t ∈ T).
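The clr transformation of Eq. (1) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `clr` and the three-part example compositions are hypothetical and are not taken from the study's dataset.

```python
import numpy as np

def clr(X):
    """Centered log-ratio transform of an (n, p) array of strictly positive parts.

    Each row is one composition; the geometric mean g is computed per row.
    """
    X = np.asarray(X, dtype=float)
    log_x = np.log(X)
    # Subtracting the row mean of the logs is the same as dividing by the
    # geometric mean inside the logarithm: ln(x_ij) - mean_j ln(x_ij) = ln(x_ij / g).
    return log_x - log_x.mean(axis=1, keepdims=True)

# Hypothetical three-part compositions, each summing to 1e6 (µmol/mol)
X = np.array([[800000.0, 199000.0, 1000.0],
              [750000.0, 248000.0, 2000.0]])
Z = clr(X)
```

By construction each row of `Z` sums to zero, and rescaling a composition by a constant leaves its clr coordinates unchanged.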

Principal component analysis and biplot

PCA linearly transforms the original variables into a smaller set of uncorrelated variables⁴³; each new variable is a linear combination of the original ones. The principal components (PCs) are ordered so that the first few retain most of the information present in the original dataset. PCA can be implemented in two modalities: linear or via Singular Value Decomposition (SVD)^44,45. The first principal component is the linear combination of all variables with the greatest variability; the second principal component captures the greatest variability remaining after the effect of the first has been removed, and so on. Since the first few components explain most of the total variance, the remaining PCs can be ignored. SVD is the most widely used method, especially when the results are to be represented in few dimensions. In this work we use SVD, following Aitchison's suggestion, since compositional data are adequately represented in the biplot. The SVD of the data matrix X, centered or standardized, is $\mathbf{X}=\mathbf{U}\boldsymbol{\Lambda}\mathbf{V}'$, where U (n, p) contains the left singular vectors (eigenvectors of XX′), $\boldsymbol{\Lambda}$ (p, p) is the diagonal matrix of singular values (the square roots of the eigenvalues of X′X), and V (p, p) contains the right singular vectors (eigenvectors of X′X); the columns of U and V are orthonormal. Splitting $\boldsymbol{\Lambda}$ into $\boldsymbol{\Lambda}^{\alpha}\boldsymbol{\Lambda}^{1-\alpha}$, with $0\le \alpha \le 1$, the SVD of matrix X becomes:

$$\mathbf{X} = \left( \mathbf{U}\boldsymbol{\Lambda}^{\alpha} \right)\left( \boldsymbol{\Lambda}^{1 - \alpha}\mathbf{V}' \right) = \mathbf{G}\mathbf{H}' \tag{2}$$

where $\mathbf{G}=\mathbf{U}\boldsymbol{\Lambda}^{\alpha}$ and $\mathbf{H}'=\boldsymbol{\Lambda}^{1-\alpha}\mathbf{V}'$, that is, $\mathbf{H}=\mathbf{V}\boldsymbol{\Lambda}^{1-\alpha}$.

The biplot^34,46 is a graphical tool that displays, in the factorial plane, the results of a principal component analysis of a matrix X (n, p). The prefix bi indicates that the plot contains information on both the n observations and the p variables. In a biplot, the n rows of the matrix G (n, 2) are represented as points (scores), corresponding to the observations, and the p rows of the matrix H (p, 2) are represented as vectors (rays), corresponding to the variables (loadings). The length of each vector from the origin of the axes approximates the standard deviation of the respective variable; the cosine of the angle between two rays approximates the correlation between the variables they represent; the projection of an observation onto a given vector approximates the value of that observation for the variable represented by the vector. The α value in Eq. (2) can range from 0 to 1. In this work, we are interested in the covariance biplot (α = 0), which preserves the covariance structure⁴⁷ and favours the display of the variables.
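The factorization in Eq. (2) can be illustrated with NumPy's SVD. The sketch below is not the authors' implementation: the function name `biplot_factors` and the random test matrix are assumptions made for illustration, and the data are column-centred before the decomposition.

```python
import numpy as np

def biplot_factors(X, alpha=0.0, n_dims=2):
    """Return biplot scores G, rays H, explained-variance shares and the centred data.

    alpha = 0 gives the covariance biplot (G = U, H = V Lambda).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # column-centre the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    G = U * s**alpha                             # U Lambda^alpha
    H = Vt.T * s**(1.0 - alpha)                  # V Lambda^(1 - alpha)
    explained = s**2 / np.sum(s**2)              # share of variance per component
    return G[:, :n_dims], H[:, :n_dims], explained, Xc

# Hypothetical data: 20 observations of 4 variables
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
G, H, explained, Xc = biplot_factors(X, alpha=0.0, n_dims=4)
```

When all dimensions are kept, `G @ H.T` reproduces the centred matrix, and for α = 0 the product `H @ H.T / (n - 1)` equals the sample covariance matrix, which is the sense in which this choice preserves the covariance structure. Truncating to two dimensions gives the usual planar approximation.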

For a compositional data matrix, SVD can be applied after a clr, alr or ilr transformation³⁴, after which the approximate correlation structure among the data can be deduced. The biplot of compositional data (relative variance biplot) shows the rays (projections of the original variables onto this orthogonal space) and the links between pairs of rays (the ratios among geochemical species). The links, that is, the approximations of the geochemical ratios in the new subspace, are compositionally invariant, which means they are good physical observables. If two links are orthogonal, the two sub-compositions (equivalently, the two sub-compositional ratios) are estimated to be independent; if the link between two rays is short or their vertices almost coincide, the ratio between the two components is nearly constant, that is, the two components are proportional^37,48. The barycentre of the biplot represents the geometric mean used in the clr transformation and can be taken as a reference to evaluate the increase or decrease of a variable with respect to the whole composition. The ray length of a compositional variable (for example the length of CO in Fig. 7) indicates the behaviour of the variable with respect to the centre of gravity (barycentre of the biplot) and provides an estimate of the standard deviation of the variable (only approximately proportional to it, because we are working in a subspace, that is, the processes are approximated).
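As a short worked step (added here for illustration, using the notation of Eq. (1) with g the geometric mean of one sample), the difference between two clr coordinates of the same sample cancels the geometric mean. This is why the link between the rays of species i and j stands for the log-ratio of the two species:

$$\operatorname{clr}(x_{i}) - \operatorname{clr}(x_{j}) = \ln\frac{x_{i}}{g} - \ln\frac{x_{j}}{g} = \ln\frac{x_{i}}{x_{j}}$$

For example, the link joining the CO₂ and CH₄ rays approximates the behaviour of ln(CO₂/CH₄) in the reduced subspace.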

Dataset and preliminary analysis

The raw geochemical dataset consists of time series of nine geochemical variables sampled about once a month; it comprises 243 chemical analyses⁴⁹ of samples collected at the main fumarolic vent in Solfatara, Bocca Grande (BG, Fig. 1), between August 2000 and October 2020. Sampling techniques and analytical procedures are reported in Caliro et al.¹². Chemical data of fumarolic fluids are expressed in micromole/mole (µmol/mol) for the gas species H₂O, CO₂, H₂S, Ar, N₂, CH₄, H₂, He and CO. Fumarolic gases do not show any detectable SO₂, HCl or HF, due to the scrubbing of magmatic gases within the hydrothermal system^12,50,51. The gas chemical compositions exhibit significant changes over time due to the periodic contributions of hotter and more oxidizing magmatic fluids entering the hydrothermal system^{4,17,28,49,52,53,54}.

Examining a single geochemical time series in isolation is misleading because the data are compositional³³. A statistical analysis is correct when it considers the ratios between two gas species (a ratio is compositionally invariant), although the number of independent ratios [(p-1)²/2] makes a global analysis of the geochemical data based on ratios complex. We therefore use PCA to explore the structure of the compositional dataset and to obtain a new, reduced set of variables for further analysis.

Before applying the PCA, we used the ilr transformation to remove outliers and to resample the geochemical data at a constant step of one sample per month, and then back-transformed the data to the simplex. The ilr subspace, having orthogonal axes and p-1 dimensions, allows each variable to be interpolated independently of the others^33,55,56. After the interpolation, the clr transformation was applied.
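The resampling step can be sketched as follows. The source does not specify the ilr basis or the interpolation scheme, so this sketch assumes a Helmert-type orthonormal basis and linear interpolation; all function names and the small three-part example are hypothetical, and outlier removal is not shown.

```python
import numpy as np

def helmert_basis(p):
    """Orthonormal basis (p, p-1) of the clr hyperplane (columns sum to zero)."""
    V = np.zeros((p, p - 1))
    for k in range(1, p):
        V[:k, k - 1] = 1.0 / k
        V[k, k - 1] = -1.0
        V[:, k - 1] *= np.sqrt(k / (k + 1.0))
    return V

def ilr(X, V):
    """Isometric log-ratio coordinates of an (n, p) array of positive parts."""
    log_x = np.log(X)
    clr_x = log_x - log_x.mean(axis=1, keepdims=True)
    return clr_x @ V

def ilr_inverse(Z, V, total=1.0):
    """Back-transform ilr coordinates to compositions summing to `total`."""
    expd = np.exp(Z @ V.T)
    return total * expd / expd.sum(axis=1, keepdims=True)

def resample_composition(t, X, t_new):
    """Interpolate each ilr coordinate independently, then return to the simplex."""
    V = helmert_basis(X.shape[1])
    Z = ilr(X, V)
    Z_new = np.column_stack(
        [np.interp(t_new, t, Z[:, j]) for j in range(Z.shape[1])]
    )
    return ilr_inverse(Z_new, V, total=X[0].sum())

# Hypothetical irregularly sampled three-part compositions (rows sum to 1000)
t = np.array([0.0, 1.0, 2.5, 4.0])
X = np.array([[700.0, 250.0, 50.0],
              [650.0, 290.0, 60.0],
              [600.0, 330.0, 70.0],
              [620.0, 300.0, 80.0]])
t_new = np.arange(0.0, 5.0)
X_new = resample_composition(t, X, t_new)
```

Because the interpolation is done on orthonormal coordinates and the result is closed back to a constant sum, the resampled rows remain valid compositions.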

The geochemical time series, resampled and clr-transformed, are shown in Fig. 2, while their values are given in the “Supplemental material”. In particular, from a geochemical point of view, the crisis appears to start around 2006 with a clear fall in the CH₄ content of the volcanic gases and a consequent increase in the ratios typically used to monitor deep volcanic activity (CO₂/CH₄ and He/CH₄).

Moreover, geochemical variables derived from gas equilibria have also been considered in the multivariate approach. Equilibrium temperatures and pressures, in the CO-CO₂ gas system, are computed according to Chiodini⁴⁹ considering the water fugacity controlled by the steam-liquid coexistence⁵⁷ and redox conditions fixed by the D’Amore and Panichi⁵⁸ empirical buffer.

Temperature is a function of the ratio CO/CO₂, H₂O pressure is a function of temperature, and CO₂ pressure is a function of temperature and of the ratio H₂/CO (all the derived quantities depend on compositionally invariant variables). The geothermometric and geobarometric functions were derived according to Chiodini et al.⁴⁹, considering (i) f_H2O fixed by the vapour–liquid coexistence and (ii) f_O2 as a function of temperature. Redox conditions of Solfatara gases were assumed to be controlled by the DP buffer (log f_O2 = 8.20 − 23,643/T). The corresponding geothermometric relation is T = 3133.5 / (0.933 − log (X_CO/X_CO2)). The geobarometric functions are log P_H2O = 5.510 − 2048/T, where the water pressure is assumed equal to the water fugacity of saturated vapour (i.e. vapour–liquid coexistence for pure water⁵⁷), log P_CO2 = 3.025 + 201/T − log (X_H2/X_CO) and P_tot ≈ P_CO2 + P_H2O (Fig. 3).
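The relations above translate directly into a few functions. The sketch below is illustrative only: the function names are hypothetical, "Log" is assumed to be the base-10 logarithm, T is assumed to be an absolute temperature in kelvin, and the pressure units are those of the original calibration, which are not restated here. The input concentrations are made-up values, not measurements from the dataset.

```python
import math

def co_co2_temperature(x_co, x_co2):
    """Equilibrium temperature from the CO/CO2 ratio (assumed kelvin)."""
    return 3133.5 / (0.933 - math.log10(x_co / x_co2))

def water_pressure(temperature):
    """H2O pressure for vapour-liquid coexistence at the given temperature."""
    return 10.0 ** (5.510 - 2048.0 / temperature)

def co2_pressure(temperature, x_h2, x_co):
    """CO2 pressure from temperature and the H2/CO ratio."""
    return 10.0 ** (3.025 + 201.0 / temperature - math.log10(x_h2 / x_co))

def total_pressure(x_co, x_co2, x_h2):
    """Approximate total pressure as P_CO2 + P_H2O."""
    temperature = co_co2_temperature(x_co, x_co2)
    return co2_pressure(temperature, x_h2, x_co) + water_pressure(temperature)

# Hypothetical gas concentrations (same units for all species)
x_co, x_co2, x_h2 = 1.0, 1.0e6, 500.0
T = co_co2_temperature(x_co, x_co2)
P_tot = total_pressure(x_co, x_co2, x_h2)
```

Only ratios of species enter the functions, so the results do not depend on the concentration units, in line with the compositional invariance noted in the text.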

The vertical displacement dataset is composed of monthly averaged measurements recorded at the RITE station by the Neapolitan Volcanoes Continuous GPS (NeVoCGPS) network from 2000 to 2020 (Fig. 4A). The original daily recordings are available in Tramelli et al.⁵⁹. During the analyzed period, a subsidence phase switched to a slow uplift around 2005, which accelerated into a fast uplift phase in 2012 that is still ongoing. The maximum value, 68.62 cm, was reached in the last analyzed year (2020). Following the hypothesis that the complex displacement pattern is generated by the superposition of deformation processes separated in frequency (first intuited by Chiodini et al.⁵ and then formalized in Petrillo et al.⁶⁰), the time series has been separated into two distinct time series: the trend obtained by a second-order polynomial fit (Fig. 4B) and the oscillating component (Fig. 4C) obtained by subtracting the trend. Hereafter, these two time series are referred to as z-trend and z-osc, respectively. It is noteworthy that the z-osc series shows a peak around 2006 (as in the CH₄ geochemical data series), and that the original uplift time series shows, around the same year, a step followed by an increasing trend (Fig. 4A).
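The separation into z-trend and z-osc can be sketched with a least-squares polynomial fit. The function name and the synthetic displacement series below are hypothetical; only the second-order polynomial and the subtraction step come from the text.

```python
import numpy as np

def split_trend_oscillation(t, z, degree=2):
    """Split a series into a polynomial trend and the residual oscillation."""
    t = np.asarray(t, dtype=float)
    z = np.asarray(z, dtype=float)
    tc = t - t.mean()                      # centring time improves conditioning
    coeffs = np.polyfit(tc, z, degree)     # least-squares polynomial fit
    z_trend = np.polyval(coeffs, tc)
    z_osc = z - z_trend                    # oscillating component
    return z_trend, z_osc

# Synthetic monthly series (not the RITE data): quadratic trend plus a slow wave
t = 2000.0 + np.arange(240) / 12.0
z = 0.2 * (t - 2005.0) ** 2 + 3.0 * np.sin(2.0 * np.pi * (t - 2000.0) / 6.0)
z_trend, z_osc = split_trend_oscillation(t, z)
```

Since the fit includes a constant term, the oscillating component has zero mean, and the two components add back exactly to the original series.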

The seismicity dataset is composed of the monthly number of earthquakes located in the CFc area (Fig. 1). As reported in the catalogue of the Osservatorio Vesuviano, Istituto Nazionale di Geofisica e Vulcanologia, between August 2000 and October 2020, 2459 seismic events occurred, mainly beneath the Solfatara-Pozzuoli area, at depths from 0 to 4.46 km, with the exception of a single event at a depth of 7 km (low quality and not considered here for further analysis). The Gutenberg–Richter distribution⁶¹ closely fits the data with magnitude M ≥ −0.5. In this study we selected 1866 volcano-tectonic earthquakes with M ≥ −0.5, for which 93% of the observed events are modelled by a straight line⁶². From 2000 to 2020, seismicity increased over time and clustered at shallow depth. Analyzing the recent seismic activity, Chiodini et al.⁵⁴ found that seismicity is distributed into low (swarms) and high (background) interarrival time populations^63,64,65. The hypocentral depth distribution has a skewness of 1.45, a mean of 1.34 km, a standard deviation of 0.63 km and a median of 1.27 km, which indicates a crowding of hypocentres towards shallow levels. We assume that the two groups of earthquakes can be discriminated spatially on the basis of their depth. To test this hypothesis, a clustering algorithm (hierarchical linkage and Euclidean distance)⁶⁶ was applied to the hypocentral depths. The resulting dendrogram (cophenetic index = 0.79) is shown in Fig. 5A.
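A depth-based hierarchical clustering of this kind can be sketched with SciPy. The linkage criterion is not stated in the text, so average linkage is assumed here; the function name and the synthetic depths are hypothetical and do not reproduce the catalogue.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet, fcluster
from scipy.spatial.distance import pdist

def cluster_depths(depths_km, n_clusters=2, method="average"):
    """Hierarchical clustering of hypocentral depths with Euclidean distance.

    Returns cluster labels, the cophenetic correlation and the linkage matrix.
    """
    d = np.asarray(depths_km, dtype=float).reshape(-1, 1)
    dist = pdist(d, metric="euclidean")          # condensed pairwise distances
    Z = linkage(dist, method=method)             # assumed linkage criterion
    coph_corr, _ = cophenet(Z, dist)             # agreement of tree and distances
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    return labels, coph_corr, Z

# Synthetic depths (km): a shallow group and a deeper group
depths = np.concatenate([np.linspace(0.5, 1.5, 30), np.linspace(3.5, 4.0, 10)])
labels, coph_corr, Z = cluster_depths(depths)
```

The linkage matrix `Z` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a tree like the one described for Fig. 5A, and the boundary between the two label groups gives the depth separating the clusters.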

Two distinct depth classes emerge (Fig. 5B,C): one between 0 and 2.36 km (first cluster, Fig. 5B) and the other below 2.36 km (second cluster, Fig. 5C). The two earthquake time series related to the two clusters are shown in Fig. 6. A very interesting result is the low temporal correlation (0.20) between the shallow and deep earthquake time series, which, together with the spatial clustering, strongly supports our hypothesis of two distinct (in time and space) physical seismic processes. Consistently with the other geophysical and geochemical variables, the earthquake occurrence again shows a significant peak around 2006.

In particular, the correlation between the original ground displacement (Fig. 4A) and the earthquake time series of the first cluster is 0.72, while the correlation between the ground displacement and the earthquake time series of the second cluster is 0.21.
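Correlations of this kind can be computed from monthly event counts. The sketch below assumes that each event has already been assigned an integer month index counted from the start of the record; the helper names and the toy inputs are hypothetical.

```python
import numpy as np

def monthly_counts(event_months, n_months):
    """Number of events per month; indices at or beyond n_months are dropped."""
    months = np.asarray(event_months, dtype=int)
    return np.bincount(months, minlength=n_months)[:n_months]

def pearson(a, b):
    """Pearson correlation coefficient between two equally long series."""
    return float(np.corrcoef(a, b)[0, 1])

# Toy month indices for a shallow and a deep group of events
shallow = monthly_counts([0, 0, 1, 3, 3, 3], n_months=5)
deep = monthly_counts([1, 2, 2, 4], n_months=5)
r_shallow_deep = pearson(shallow, deep)
```

The same `pearson` helper applies to a count series and a displacement series, provided both are sampled on the same monthly grid.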

Hereafter, the shallow and deep earthquake occurrence time series are referred to as s-E and d-E, respectively.

Results and discussion

Principal component analysis on geochemical species

To investigate the distribution of the geochemical variables over the reference period, a PCA was applied to the monthly sampled data, clr-transformed using Eq. (1), to obtain a relative covariance biplot (Fig. 7; Tables 1 and 2).

Table 1 Correlations among variables and PCs of the nine geochemical species in Fig. 7.


Table 2 Squared cosines referring to the PCA of the nine geochemical species in Fig. 7.


We look for hidden variables that control the changes in the composition of the fumarolic fluids over time. To do this, we identify the links that are best represented by the first three PCs (Tables 3, 4, 5).

Table 3 Correlations between the geochemical ratios and the first PC.


Table 4 Correlations between the geochemical ratios and the second PC.


Table 5 Correlations between the geochemical ratios and the third PC.


Looking at the biplot of the first three components (Fig. 7; 95% explained variance), CO and CH₄ have the longest rays and therefore exhibit much greater variability over the time interval examined than all the other geochemical species.

On the left side of the biplot (Fig. 7) are the scores related to the period 2000–2005 (blue points), which reveal a higher content of H₂O, N₂ and CH₄. Following the evolution of the scores through time, we can see that the system at t = 2000 is richer in CH₄ than at t > 2005, when He and CO₂ enter the hydrothermal system and start to play a major role in the score distribution (projection). In the transition period (from 2005 to 2017), He and CO₂ mark a change in the evolution of the hydrothermal system state. From 2017 to 2020 we notice a decrease in the CO₂ and He content and an increase of CH₄ in the hydrothermal system. CO, which is the ray with the highest variance, dominates the general trend.

In the context of compositional theory, the analysis of individual species in the clr space leads to general results; in addition, we use the principal ratios (links) that are commonly adopted in volcanic environments and can be interpreted geochemically. Therefore, we consider the correlation between the main ratios and the first three PCs (Tables 3, 4, 5).
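The correlation between a geochemical log-ratio and the principal components can be sketched as below. This is an illustration under assumptions, not the procedure used to build Tables 3 to 5: the function names are hypothetical, the ratio is taken in logarithmic form, and the four-part random compositions are synthetic.

```python
import numpy as np

def clr(X):
    """Centered log-ratio transform of an (n, p) array of positive parts."""
    log_x = np.log(np.asarray(X, dtype=float))
    return log_x - log_x.mean(axis=1, keepdims=True)

def pc_scores(Z, n_components=3):
    """Scores of the first principal components of the column-centred data."""
    Zc = Z - Z.mean(axis=0)
    U, s, _ = np.linalg.svd(Zc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def logratio_pc_correlation(X, i, j, n_components=3):
    """Pearson correlation between ln(x_i / x_j) and each of the first PCs."""
    X = np.asarray(X, dtype=float)
    scores = pc_scores(clr(X), n_components)
    log_ratio = np.log(X[:, i] / X[:, j])
    return np.array(
        [np.corrcoef(log_ratio, scores[:, k])[0, 1] for k in range(n_components)]
    )

# Synthetic four-part compositions (50 samples); not real fumarole data
rng = np.random.default_rng(1)
X = rng.uniform(1.0, 10.0, size=(50, 4))
r = logratio_pc_correlation(X, 0, 1)
```

The sign of each PC returned by an SVD is arbitrary, so only the magnitude of a correlation and its sign relative to other ratios on the same axis are meaningful.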

The links (ratios) with geochemical significance that are highly correlated with the first PC are CO/CO₂ (r = 0.96) and CO/H₂ (r = 0.95) (Table 3 and Fig. 8). These associations and ratios are notable because CO and H₂ are generally considered gas species controlled by the temperature and pressure conditions of hydrothermal systems at depth⁶⁷. It should be emphasized that CO is related to all compositional species and that all these relationships are strongly correlated with the first component (r ≥ 0.91). Considering that CO is the most representative species of the fumarole gas, delineating a warning trend⁵³ of the hydrothermal system conditions, we can hypothesize that the first axis (Figs. 7 and 8) represents a latent process of heating/pressurization of the hydrothermal system. In the following, we refer to this first principal component as GF1 (first geochemical factor).

The second component, orthogonal to the first, opposes CH₄ and H₂S to He, Ar and CO₂ (Fig. 8 and Tables 1, 2). In the period 2000–2006 the second principal component is positively dominated by CH₄ (which indicates a probable absence of deep magmatic fluid injection into the hydrothermal system), while in the intermediate period 2007–2017 the contribution of He and CO₂ determines a transition with a relative decrease of CH₄. The last period (2017–2020) is again characterized by a relative increase of CH₄. In the factorial plane, Ar is present in the transition period; however, its occurrence is not discussed further because of likely significant air contamination (see the wide high-frequency oscillation in the Ar signal of Fig. 2).

To interpret the second axis we again turn to the correlations between the geochemical ratios and the second component (Table 4). The ratios representative of the hydrothermal dynamics, He/CH₄ (r = −0.88) and CO₂/CH₄ (r = −0.83), are well projected on the second component.

These ratios have been suggested by Chiodini et al.^{5,15,28,53,68} as indicative of the arrival of magmatic fluids in the hydrothermal system from depth. The CO₂/CH₄ and He/CH₄ ratios have been interpreted as powerful indicators of magma degassing episodes⁶⁵. The magmatic gases entering the hydrothermal system are, in fact, relatively rich in CO₂ and He and poor in CH₄, a species that is formed in the hydrothermal environment. Therefore, the second axis could represent a deep-rooted hydrothermal process that, around 2006–2007 (see the projection of the data on the links CO₂/CH₄ and He/CH₄ in Fig. 8), generated the pressure and temperature increase well represented by the first principal component.

We underline that the first component is very strongly correlated (angles between the rays and the first axis very close to zero, Fig. 7) with the increasing CO trend (Fig. 2) and the reverse H₂O trend (Fig. 2). The second component, orthogonal to the first, represents an independent process dominated by CH₄. Note that orthogonality implies that, even if the second component (batches of deep magmatic fluid) could be the cause of the first, their direct statistical dependence must have been lost over time. In the following, we refer to the second principal component as GF2 (second geochemical factor).

The third component (12% explained variance) is dominated during the intermediate period (2007–2012) by CO₂, opposed to Ar and H₂S (Fig. 7; Table 1). CO₂ is the best-represented geochemical species on this axis (Table 2). We hypothesize that the third principal component (hereafter GF3, third geochemical factor) is a latent factor linked to the production of CO₂, probably from a deeper zone of the volcanic system.

Joint PCA on geochemical and geophysical data

We were interested in the relationship over time among the geochemical data (Fig. 2), the geophysical data (Fig. 4B and C and Fig. 6B and C), and the derived T–P functions (Fig. 3). The principal component analysis⁶⁹ was conducted on the multivariate dataset in which the geochemical data are represented by GF1, GF2 and GF3 obtained from the previous analysis. We found that 85% of the total variance was explained by the first three components. The biplot in Fig. 9 (together with the inferences obtained from the data listed in Tables 6 and 7) shows an impressive development of the trajectory of the scores, which represents the state of the system. In the early years (2000–2008), the dynamics are substantially confined to a volume (parallelepiped in Fig. 9) mainly described by the second and third PCs; here the dynamics are governed by GF2, GF3, deep earthquakes and z-oscillation. From 2009 to 2014 the system loses its stability and tends to extend along the first PC. From around 2015 to 2020, the system undergoes a violent impulse and the pattern of scores moves away from the parallelepiped along the first PC.
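A joint PCA on variables with different units can be sketched as follows. The text does not state whether the joint dataset was standardized; this sketch assumes it was (zero mean and unit variance per column), in which case the squared cosine of a variable on a PC coincides with its squared correlation with that PC. The function name and the random input are hypothetical.

```python
import numpy as np

def standardized_pca(X):
    """PCA of column-standardized data via SVD.

    Returns scores, explained-variance shares, variable-PC correlations
    and squared cosines (one row per variable, one column per PC).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    Zs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # assumed standardization
    U, s, Vt = np.linalg.svd(Zs, full_matrices=False)
    scores = U * s
    explained = s**2 / np.sum(s**2)
    # For unit-variance variables: corr(variable j, PC k) = v_jk * s_k / sqrt(n - 1)
    corr = Vt.T * s / np.sqrt(n - 1)
    cos2 = corr**2
    return scores, explained, corr, cos2

# Synthetic stand-in for a joint table (40 months, 5 variables)
rng = np.random.default_rng(2)
X = rng.normal(size=(40, 5)) * np.array([1.0, 10.0, 0.1, 5.0, 2.0])
scores, explained, corr, cos2 = standardized_pca(X)
cumulative = np.cumsum(explained)     # e.g. share explained by the first three PCs
```

For each variable the squared cosines sum to one across all components, so the largest entry in a row identifies the PC on which that variable is best represented.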

Table 6 Correlations among the first three PCs and the variables shown in Biplot of Fig. 9.


Table 7 Squared cosines of the variables referred to the Biplot in Fig. 9 (in bold the values that correspond, for each variable, to the factor for which the squared cosine is the largest).


The first PC (Fig. 9) is strongly and positively correlated with the z-trend, the CO estimated temperature, the H₂O estimated pressure and GF1. The association of shallow earthquakes and CO₂ estimated pressure with the first component is more moderate but still very significant (Tables 6 and 7).

These correlations, together with the spatial correspondence between shallow earthquakes (located between 0 and 2.36 km depth) and the hydrothermal system, support a strong link between the two. According to Chiodini et al. (2021), heating/pressurization of the hydrothermal system plays an active role in triggering low-magnitude seismicity at shallow depth⁴⁹. The group of variables that characterizes the first component suggests a very interesting and largely original interpretation of the deformation trend, which could be driven by hydrothermal heating/pressurization in the first 2–3 km of the volcanic apparatus. This part of the hydrothermal system would be responsible for the deformation trend (bradyseism) as well as for the shallow earthquake occurrence. In fact, the advective/convective fluid transport mechanism increases the stress/strain through fluid pressure, and thus the earthquake occurrence. We interpret this first component as representing fluid-related processes occurring in the hydrothermal system.

The second component of the PCA (Fig. 9) is strongly related only to GF3 (Tables 6 and 7), whose major variability is found around 2010–2012. Note that the scores of this period are fully projected on the GF3 vector, which, under the hypothesis that GF3 is strongly correlated with CO₂, could be the trigger of the subsequent crisis, well represented by the first PC (Fig. 9). This leads us to interpret the second component (Fig. 9) as representing very deep volcanic processes related to the production of CO₂. The second component has weaker negative correlations with z-osc and shallow earthquakes, and weak positive correlations with GF2, deep earthquakes and CO₂ estimated pressure. Notably, this component is completely independent of the z-trend.

The third component of the PCA (Fig. 9) is strongly related to GF2 (Tables 6 and 7) and to the z-oscillations, whose major variability is found around 2002–2012 (from blue to green in Fig. 10). Since GF2 could represent a process connected to the deep injection of magmatic fluids, we suggest that the z-oscillations (in agreement with Chiodini et al.⁵) are generated by CH₄-poor magmatic injections at the root of the hydrothermal system, probably below the region that hosts most of the earthquakes (Fig. 1), as suggested by the very low correlations shown in Table 6.

Conclusions

Twenty years of geochemical, thermodynamical and geophysical observations at CFc were analyzed by means of PCA. The goal was to reveal the basic phenomenon responsible for the ongoing volcanic crisis, which started around 2006 and perturbed the physical–chemical state of the CFc magmatic system. The current volcanic hazard level (yellow) has been determined by the increase in intensity and frequency of the main unrest indicators: earthquake occurrence, ground deformation and volcanic gas flux at the surface. The multivariate statistical analysis we applied suggests that processes occur in both the hydrothermal and the magmatic system, and describes how these processes evolve through time.

The heating/pressurization processes, together with the deformation trend and the shallow earthquake occurrences, strongly dominate the multivariate space (first geochemical/geophysical PC) during the whole period; however, the primary contribution to this PC comes from the recent steep increase of these processes. We interpret this first component as representing processes occurring in the hydrothermal system and dominating the ongoing unrest.

The second and third components, which capture the modulation of the processes through the oscillation of the variables, show a significant projection of GF2, GF3, deep earthquakes and z-oscillation. These components, associated with deep volcanic processes, dominate the years 2000–2012. Injection of deep heated fluids at the base of the hydrothermal system could be responsible for the deformation pulses, as already discussed in Chiodini et al. (2015)⁵. The CFc volcanic unrest could have been triggered by the GF3 variability.

In conclusion, the unified and integrated approach to geochemical and geophysical indicators applied in this study has revealed the hidden and independent processes underlying the CF volcanic crisis, which are not clearly identifiable when only a subset of the indicators is considered.

The results of this study provide a basis for identifying further, and perhaps more effective, geochemical relationships that could improve the monitoring of the evolving volcanic processes affecting calderas similar to Campi Flegrei.

The adopted strategy, which applies compositional data theory to the geochemical data from the CF caldera, offers a global interpretative framework and confirms that geochemical processes are a keystone in the interpretation of the volcanic phenomena at the CF caldera. This methodology is applicable to other calderas worldwide in a similar state of hydrothermal activity.
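As an illustration of what a compositional treatment involves, the sketch below applies a centered log-ratio (clr) transform before PCA. This is one common choice in compositional data analysis; the specific log-ratio transform used in the study is not restated in this section, so the clr step, the function names and the synthetic five-part compositions are all assumptions made for the example.

```python
import numpy as np

def clr(parts):
    """Centered log-ratio transform of strictly positive compositions (one per row)."""
    logp = np.log(parts)
    return logp - logp.mean(axis=1, keepdims=True)

def compositional_pca(parts, n_components=3):
    """PCA of clr-transformed compositions via SVD (illustrative helper)."""
    Z = clr(parts)
    Zc = Z - Z.mean(axis=0)                       # center each clr coordinate
    U, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    scores = (U * s)[:, :n_components]            # sample coordinates
    loadings = Vt[:n_components].T                # clr-variable projections
    return scores, loadings

# Synthetic five-part compositions (hypothetical, not fumarole data).
rng = np.random.default_rng(1)
raw = rng.lognormal(mean=0.0, sigma=0.5, size=(100, 5))
comp = raw / raw.sum(axis=1, keepdims=True)       # closure: each row sums to 1

scores, loadings = compositional_pca(comp)
```

Because the clr transform depends only on ratios between parts, the closed compositions and the raw amounts give the same transformed data. This removes the spurious correlations that the constant-sum constraint would otherwise induce.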

Data availability

The datasets analyzed during the current study are available in the “Supplementary Information” section.

References

Acocella, V., Di Lorenzo, R., Newhall, C. & Scandone, R. An overview of recent (1988 to 2014) caldera unrest: Knowledge and perspectives. Rev. Geophys. 53, 896–955 (2015).
Sparks, R. S. J., Biggs, J. & Neuberg, J. W. Monitoring volcanoes. Science 335, 1310–1311 (2012).
Buono, G. et al. New insights into the recent magma dynamics under Campi Flegrei caldera (Italy) from petrological and geochemical evidence. J. Geophys. Res. Solid Earth 127, e2021JB023773 (2022).
Chiodini, G. et al. Magmas near the critical degassing pressure drive volcanic unrest towards a critical state. Nat. Commun. 7, 13712 (2016).
Chiodini, G. et al. Evidence of thermal-driven processes triggering the 2005–2014 unrest at Campi Flegrei caldera. Earth Planet. Sci. Lett. 414, 58–67 (2015).
De Vivo, B. et al. New constraints on the pyroclastic eruptive history of the Campanian volcanic Plain (Italy). Mineral. Petrol. 73, 47–65 (2001).
Deino, A. L., Orsi, G., de Vita, S. & Piochi, M. The age of the Neapolitan Yellow Tuff caldera-forming eruption (Campi Flegrei caldera – Italy) assessed by 40Ar/39Ar dating method. J. Volcanol. Geotherm. Res. 133, 157–170 (2004).
Isaia, R., Marianelli, P. & Sbrana, A. Caldera unrest prior to intense volcanism in Campi Flegrei (Italy) at 4.0 ka BP: Implications for caldera dynamics and future eruptive scenarios. Geophys. Res. Lett. 36, 213–217 (2009).
Isaia, R., D’Antonio, M., Dell’Erba, F., Di Vito, M. & Orsi, G. The Astroni volcano: the only example of closely spaced eruptions in the same vent area during the recent history of the Campi Flegrei caldera (Italy). J. Volcanol. Geotherm. Res. 133, 171–192 (2004).
Di Vito, M., Lirer, L., Mastrolorenzo, G. & Rolandi, G. The 1538 Monte Nuovo eruption (Campi Flegrei, Italy). Bull. Volcanol. 49, 608–615 (1987).
Allard, P., Maiorani, A., Tedesco, D., Cortecci, G. & Turi, B. Isotopic study of the origin of sulfur and carbon in Solfatara fumaroles, Campi Flegrei caldera. J. Volcanol. Geotherm. Res. 48, 139–159 (1991).
Caliro, S. et al. The origin of the fumaroles of La Solfatara (Campi Flegrei, south Italy). Geochim. Cosmochim. Acta 71, 3040–3055 (2007).
Weber, E. et al. Development and testing of an advanced monitoring infrastructure (ISNet) for seismic early-warning applications in the Campania region of southern Italy. In Earthquake Early Warning Systems (eds Gasparini, P. et al.) 325–341 (Springer, 2007).
Iannaccone, G., Guardato, S., Vassallo, M., Elia, L. & Beranzoli, L. A new multidisciplinary marine monitoring system for the surveillance of volcanic and seismic areas. Seismol. Res. Lett. 80, 203–213 (2009).
Chiodini, G. et al. Long-term variations of the Campi Flegrei, Italy, volcanic system as revealed by the monitoring of hydrothermal activity. J. Geophys. Res. 115, B03205 (2010).
Somma, R. et al. Application of laser scanning for monitoring coastal cliff instability in the Pozzuoli Bay, Coroglio site, Posillipo hill, Naples. In Engineering Geology for Society and Territory Vol. 5 (eds Lollino, G. et al.) (Springer, 2014).
Vilardo, G., Sansivero, F. & Chiodini, G. Long-term TIR imagery processing for spatiotemporal monitoring of surface thermal features in volcanic environment: A case study in the Campi Flegrei (Southern Italy). J. Geophys. Res. Solid Earth 120, 812–826 (2015).
Carlino, S. et al. Distributed-temperature-sensing using optical methods: A first application in the offshore area of Campi Flegrei caldera (Southern Italy) for volcano monitoring. Remote Sens. 8, 674 (2016).
Queißer, M., Granieri, D. & Burton, M. A new frontier in CO₂ flux measurements using a highly portable DIAL laser system. Sci. Rep. 6, 33834 (2016).
Zaccarelli, L. & Bianco, F. Noise-based seismic monitoring of the Campi Flegrei caldera. Geophys. Res. Lett. 44, 2237–2244 (2017).
Pappalardo, L., Piochi, M., D’Antonio, M., Civetta, L. & Petrini, R. Evidence for multi-stage magmatic evolution during the past 60 kyr at Campi Flegrei (Italy) deduced from Sr, Nd and Pb isotope data. J. Petrol. 43, 1415–1434 (2002).
Rossano, S., Mastrolorenzo, G. & De Natale, G. Numerical simulation of pyroclastic density currents on Campi Flegrei topography: A tool for statistical hazard estimation. J. Volcanol. Geotherm. Res. 132, 1–14 (2004).
Gottsmann, J., Camacho, A. G., Tiampo, K. F. & Fernández, J. Spatiotemporal variations in vertical gravity gradients at the Campi Flegrei caldera (Italy): A case for source multiplicity during unrest?. Geophys. J. Int. 167, 1089–1096 (2006).
Bodnar, R. J. et al. Quantitative model for magma degassing and ground deformation (bradyseism) at Campi Flegrei, Italy: Implications for future eruptions. Geology 35, 791 (2007).
Costa, A. et al. Tephra fallout hazard assessment at the Campi Flegrei caldera (Italy). Bull. Volcanol. 71, 259–273 (2009).
Lima, A. et al. Thermodynamic model for uplift and deflation episodes (bradyseism) associated with magmatic–hydrothermal activity at the Campi Flegrei (Italy). Earth-Sci. Rev. 97, 44–58 (2009).
Rinaldi, A. P., Todesco, M. & Bonafede, M. Hydrothermal instability and ground displacement at the Campi Flegrei caldera. Phys. Earth Planet. Inter. 178, 155–161 (2010).
Chiodini, G., Caliro, S., De Martino, P., Avino, R. & Gherardi, F. Early signals of new volcanic unrest at Campi Flegrei caldera? Insights from geochemical data and physical simulations. Geology 40, 943–946 (2012).
Petrillo, Z. et al. Defining a 3D physical model for the hydrothermal circulation at Campi Flegrei caldera (Italy). J. Volcanol. Geotherm. Res. 264, 172–182 (2013).
Ashley, R. P. & Lloyd, J. W. An example of the use of factor analysis and cluster analysis in groundwater chemistry interpretation. J. Hydrol. 39, 355–364 (1978).
Wenning, R. J. & Erickson, G. A. Interpretation and analysis of complex environmental data using chemometric methods. TrAC Trends Anal. Chem. 13, 446–457 (1994).
Johnson, R. A. & Wichern, D. Multivariate analysis. In Wiley StatsRef: Statistics Reference Online (eds Balakrishnan, N. et al.) 1–20 (Wiley, 2015). https://doi.org/10.1002/9781118445112.stat02623.pub2.
Filzmoser, P., Hron, K. & Reimann, C. Principal component analysis for compositional data with outliers. Environmetrics 20, 621–632 (2009).
Aitchison, J. & Greenacre, M. Biplots of compositional data. J. R. Stat. Soc. Ser. C Appl. Stat. 51, 375–392 (2002).
Aitchison, J. The statistical analysis of compositional data. J. R. Stat. Soc. Ser. B Methodol. 44, 139–160 (1982).
Chayes, F. On correlation between variables of constant sum. J. Geophys. Res. 65, 4185–4193 (1960).
Pawlowsky-Glahn, V. & Buccianti, A. Compositional Data Analysis: Theory and Applications (John Wiley & Sons, 2011).
Pawlowsky-Glahn, V., Egozcue, J. J. & Tolosana-Delgado, R. Modeling and Analysis of Compositional Data (John Wiley & Sons, 2015).
Aitchison, J. The Statistical Analysis of Compositional Data (Chapman and Hall, 1986).
Egozcue, J. J., Pawlowsky-Glahn, V., Mateu-Figueras, G. & Barcelo-Vidal, C. Isometric logratio transformations for compositional data analysis. Math. Geol. 35, 279–300 (2003).
Egozcue, J. J. & Pawlowsky-Glahn, V. Groups of parts and their balances in compositional data analysis. Math. Geol. 37, 795–828 (2005).
Van den Boogaart, K. G. & Tolosana-Delgado, R. Analyzing Compositional Data with R Vol. 122 (Springer, 2013).
Jolliffe, I. T. & Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. R. Soc. Math. Phys. Eng. Sci. 374, 20150202 (2016).
Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 24, 417 (1933).
Eckart, C. & Young, G. The approximation of one matrix by another of lower rank. Psychometrika 1, 211–218 (1936).
Gabriel, K. R. The biplot graphic display of matrices with application to principal component analysis. Biometrika 58, 453–467 (1971).
Greenacre, M. J. & Underhill, L. G. Scaling a data matrix in a low-dimensional Euclidean space. In Topics in Applied Multivariate Analysis (ed. Hawkins, D. M.) (Cambridge University Press, 1982).
Bacon-Shone, J., Buccianti, A., Mateu-Figueras, G. & Pawlowsky-Glahn, V. (eds) Compositional Data Analysis in the Geosciences: From Theory to Practice (2008).
Chiodini, G. et al. Hydrothermal pressure-temperature control on CO₂ emissions and seismicity at Campi Flegrei (Italy). J. Volcanol. Geotherm. Res. 414, 107245 (2021).
Cioni, R., Corazza, E. & Marini, L. The gas/steam ratio as indicator of heat transfer at the Solfatara fumaroles, Phlegraean Fields (Italy). Bull. Volcanol. 47, 295–302 (1984).
Chiodini, G. et al. CO₂ degassing and energy release at Solfatara volcano, Campi Flegrei, Italy. J. Geophys. Res. Solid Earth 106, 16213–16221 (2001).
Chiodini, G. et al. Magma degassing as a trigger of bradyseismic events: The case of Phlegrean Fields (Italy). Geophys. Res. Lett. 30, 1434 (2003).
Chiodini, G., Avino, R., Caliro, S. & Minopoli, C. Temperature and pressure gas geoindicators at the Solfatara fumaroles (Campi Flegrei). Ann. Geophys. 54, 4 (2011).
Chiodini, G. et al. Clues on the origin of post-2000 earthquakes at Campi Flegrei caldera (Italy). Sci. Rep. 7, 4472 (2017).
Filzmoser, P. & Hron, K. Outlier detection for compositional data using robust methods. Math. Geosci. 40, 233–248 (2008).
Filzmoser, P., Hron, K. & Reimann, C. Interpretation of multivariate outliers for compositional data. Comput. Geosci. 39, 77–85 (2012).
Giggenbach, W. F. Geothermal gas equilibria. Geochim. Cosmochim. Acta 44, 2021–2032 (1980).
D’Amore, F. & Panichi, C. Evaluation of deep temperatures of hydrothermal systems by a new gas geothermometer. Geochim. Cosmochim. Acta 44, 549–556 (1980).
Tramelli, A. et al. Statistics of seismicity to investigate the Campi Flegrei caldera unrest. Sci. Rep. 11, 1–10 (2021).
Petrillo, Z. et al. A perturbative approach for modeling short-term fluid-driven ground deformation episodes on volcanoes: A case study in the Campi Flegrei caldera (Italy). J. Geophys. Res. Solid Earth 124, 1036–1056 (2019).
Gutenberg, B. & Richter, C. F. Seismicity of the Earth and Associated Phenomena (Princeton University Press, 1954).
Wiemer, S. Minimum magnitude of completeness in earthquake catalogs: Examples from Alaska, the Western United States, and Japan. Bull. Seismol. Soc. Am. 90, 859–869 (2000).
Petrosino, S., Cusano, P. & Madonia, P. Tidal and hydrological periodicities of seismicity reveal new risk scenarios at Campi Flegrei caldera. Sci. Rep. 8, 13808 (2018).
Petrosino, S. & De Siena, L. Fluid migrations and volcanic earthquakes from depolarized ambient noise. Nat. Commun. 12, 6656 (2021).
Tramelli, A., Giudicepietro, F., Ricciolino, P. & Chiodini, G. The seismicity of Campi Flegrei in the contest of an evolving long term unrest. Sci. Rep. 12, 2900 (2022).
Gordon, A. D. Classification (CRC Press, 1999).
Chiodini, G. & Marini, L. Hydrothermal gas equilibria: the H₂O-H₂-CO₂-CO-CH₄ system. Geochim. Cosmochim. Acta 62, 2673–2687 (1998).
Chiodini, G. CO₂/CH₄ ratio in fumaroles a powerful tool to detect magma degassing episodes at quiescent volcanoes. Geophys. Res. Lett. 36, (2009).
Jolliffe, I. T. A note on the use of principal components in regression. Appl. Stat. 31, 300 (1982).


Acknowledgements

This study was partially supported by the Progetto Strategico Dipartimentale INGV 2019 “Linking surface Observables to sub–Volcanic plumbing-system: A multidisciplinary approach for Eruption forecasting at Campi Flegrei caldera (Italy)”, LOVE-CF and by the INGV Institutional funds: Ricerca Libera 2021. The data analysed in this study were provided by Unità Funzionale Geochimica dei fluidi, Istituto Nazionale di Geofisica e Vulcanologia (Osservatorio Vesuviano).

Author information

Authors and Affiliations

Istituto Nazionale di Geofisica e Vulcanologia (INGV), Osservatorio Vesuviano, Via Diocleziano 328, Napoli, Italy
Zaccaria Petrillo, Annarita Mangiacapra, Sergio Scippacercola & Stefano Caliro
Dipartimento di Scienze della Terra e Geoambientali, Università degli Studi di Bari Aldo Moro, via Orabona 4, Bari, Italy
Simona Tripaldi
Istituto Nazionale di Geofisica e Vulcanologia, Sezione di Bologna, Via Donato Creti, 12, 40128, Bologna, Italy
Giovanni Chiodini

Authors

Zaccaria Petrillo
Simona Tripaldi
Annarita Mangiacapra
Sergio Scippacercola
Stefano Caliro
Giovanni Chiodini

Contributions

Z.P., S.T., A.M. and S.S. contributed to the conception of the study; Z.P., S.T., A.M. and S.S. wrote the paper, S.C. and G.C. contributed to the data interpretation; S.C. contributed to the acquisition and to the data analysis; all authors revised the manuscript; all authors approved the submitted version of the manuscript.

Corresponding author

Correspondence to Annarita Mangiacapra.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Petrillo, Z., Tripaldi, S., Mangiacapra, A. et al. Principal component analysis on twenty years (2000–2020) of geochemical and geophysical observations at Campi Flegrei active caldera. Sci Rep 13, 18445 (2023). https://doi.org/10.1038/s41598-023-45108-0


Received: 20 December 2022
Accepted: 16 October 2023
Published: 27 October 2023
Version of record: 27 October 2023
DOI: https://doi.org/10.1038/s41598-023-45108-0
