Introduction

The cost of living and quality of life are two interconnected concepts, as evaluating people’s ability to access goods and services makes it possible to determine whether they believe their expectations for a satisfactory life are being met. The cost of living is understood as the expenses incurred to maintain minimum living conditions for individuals1. To the extent that the goods and services acquired by a person translate into satisfied needs, their perception of their quality of life can be measured. Quality of life, conceived as an individual’s interpretation of achieving goals, meeting life expectations, and its correspondence with their current situation2, has been addressed in the literature in various contexts (3,4). Measuring quality of life could be considered problematic due to the subjective nature of its components; however, measurable objective aspects have also been proposed for its study and broader understanding5.

The study of the relationship between cost of living and quality of life has been a topic of growing interest in socio-economic research and public policy formulation, as these indicators allow for the assessment of population well-being and access to essential goods and services6,7. In a world characterized by demographic, economic, and environmental changes, understanding how these metrics vary across different regions and what factors determine their differences becomes crucial for decision-making at both governmental and corporate levels8,9.

Recent advances in three-way data analysis, which handles temporal, spatial, and indicator-wise dimensions jointly, have enabled more nuanced examinations of such complex relationships10,11. Traditional two-way approaches, such as Principal Component Analysis (PCA) or regression models, often fail to capture the inherent multidimensionality of socio-economic data, where interactions between countries, time periods, and variables must be analyzed simultaneously. Three-way methods, such as the Tucker3 model, address this by decomposing data tensors into interpretable components while preserving interactions across modes12,13. This is particularly relevant for studies on cost of living and quality of life, where dynamic trends such as inflation and policy changes, as well as regional disparities, give rise to complex data structures that necessitate the use of multidimensional analytical tools14.

Cost of living indices have been widely used to measure the economic accessibility of goods and services in different countries and regions. By collecting information on the prices of essential products and average wages, these indices allow for the comparison of purchasing power across different territories, albeit with methodological limitations that have been pointed out in multiple studies15,16. However, heterogeneity within countries can be significant, as costs vary not only between countries but also between regions and socio-economic strata within the same nation17,18.

On the other hand, quality of life indices incorporate broader dimensions, such as access to education, healthcare, security, and subjective well-being perception19,20. These indices seek to capture well-being beyond the economic dimension and have been used in various studies to evaluate human development and social equity21,22. However, their measurement is complex and often influenced by political and methodological factors that can affect the interpretation of results23,24.

Over the past decades, multivariate statistics have played a fundamental role in the analysis of these indices, providing advanced tools for modeling multidimensional data and detecting patterns in complex datasets. In particular, techniques such as Principal Component Analysis (PCA), Factor Analysis (FA), HJ-Biplot25, Tucker3 decomposition, and regression models, among others, have been applied in studies on quality of life and cost of living to reduce dimensionality and obtain interpretable visual representations of the data26,27,28. Following a comprehensive review of the literature, to the best of our knowledge, the technique of disjoint components has not been used to analyze three-way data related to economic variables or sustainability, nor in the study of cost of living and quality of life indices. Table 1 presents a summary of recent studies on cost of living and quality of life analysis, highlighting the statistical techniques used and their main findings.

Table 1 Summary of recent studies on cost of living and quality of life.

The adoption of three-way analysis is particularly timely given the increasing availability of longitudinal and multilevel datasets in socio-economics42. For example, the interplay between cost of living shocks, such as those experienced during the COVID-19 pandemic, and quality of life perceptions can only be fully understood through methods that account for temporal and cross-sectional variability simultaneously43,44. Our work bridges this gap by proposing a unified framework combining Tucker3 decomposition with dynamic HJ-Biplot visualization.

The objective of this work is to propose a new framework for analyzing three-way data related to cost of living and quality of life, particularly within the context of sustainability research. To demonstrate the advantages of our approach, we conducted a case study focusing on Argentina, Brazil, Canada, Chile, Colombia, Ecuador, United States, Mexico, Peru, and Uruguay. Our analysis incorporates both subjective perception variables such as security and healthcare services and objective factors, including purchasing power, price-to-income ratio, and pollution index, to provide a more comprehensive understanding of the phenomenon under investigation.

The analysis covers the period from 2019 to 2024. The main advantage of our framework lies in the ease of interpretation achieved through the use of disjoint components in the loading matrices of both the Tucker3 model and the Parafac model. These results are further complemented by a dynamic HJ-Biplot analysis. The integration of these methods provides a more comprehensive perspective on the analyzed data tensor.

The remainder of this article is structured as follows. In the “Materials and methods” section, we describe the data sources utilized, the data selection process, and the construction of the three-way table. We also present an overview of the Tucker3 and Parafac models, along with their disjoint components and the dynamic HJ-Biplot analysis. This section concludes with a presentation of the proposed methodology. The “Results and discussion” section is dedicated to the case study and the application of the framework. We determine the number of components for each mode of the Tucker3 model and calculate the respective components. Subsequently, a dynamic HJ-Biplot analysis is conducted, alongside an analysis of the Parafac model, followed by interpretations and corresponding contrasts, which include the development of tables and plots. Finally, in the “Conclusions” section, we present our findings, highlighting the advantages but also some limitations of the methodology that we propose. Additionally, we draw conclusions related to the case study and discuss potential future work.

Materials and methods

Data

For the case study, we obtained the data from the website of the Numbeo organization, whose main webpage can be accessed at https://es.numbeo.com. Numbeo offers up-to-date information on countries (and cities) related to cost of living and quality of life. From these data we built a three-way table (or tensor). Each way of the table corresponds to a mode, defined as the set of entities that belong to that particular way or dimension. The modes of our study are listed in Table 2. Additionally, Table 3 gives the meaning of each variable in the study and the way it was measured.

Table 2 Modes of the three-way table utilized in the case study.
Table 3 Interpretation of the variables.

Theoretical background

The statistical characterization of cost of living and quality of life indices is achieved through a methodology that integrates the Tucker3 model, dynamic HJ-Biplot analysis, and a variant of Tucker3 decomposition that utilizes disjoint components. The mathematical formulation of these methods enables the analysis of data organized in three dimensions, facilitating the interpretation of latent structures in complex datasets.

The dynamic HJ-Biplot analysis, developed in26, is an extension of the static HJ-Biplot25 that incorporates the temporal dimension through the use of tensors, allowing the simultaneous representation of individuals and variables in a lower-dimensional space over time while preserving the maximum variability of the data. Formally, if \(\underline{\textbf{X}}\) represents a data tensor structured in three modes (individuals, variables, and time), the singular value decomposition (SVD) applied at each instant t is expressed as

$$\begin{aligned} \textbf{X}_t = \textbf{U}_t \mathbf {\Lambda }_t \textbf{V}_t^T, \end{aligned}$$

where \(\textbf{X}_t\) represents the t-th frontal slice of \(\underline{\textbf{X}}\), \(\textbf{U}_t\) and \(\textbf{V}_t\) are matrices of singular vectors, and \(\mathbf {\Lambda }_t\) is the diagonal matrix of singular values. This model allows for the identification of temporal patterns in the data through the construction of trajectories in the projected space. As described in26, if we consider the original tensor \(\underline{\textbf{X}}\) with dimensions \(I \times J \times K\), where I is the number of individuals, J the number of variables, and K the number of temporal situations, the matrix of configurations over time is obtained by extracting multiple two-dimensional representations for each variable from the three-way tensor \(\underline{\textbf{X}}\). Let \(\textbf{Z}_j\) (\(1 \le j \le J\)) be the resulting matrices, each representing the evolution of a specific variable over time. The projection of \(\textbf{Z}_j\) in the HJ-Biplot space makes it possible to visualize the evolution of variables across different temporal situations. Additionally, the representation of individuals is based on the projection of new points onto the reference configuration using \(\textbf{o} = (\textbf{B}^{\textrm{T}} \textbf{B})^{-1} \textbf{B}^{\textrm{T}} \textbf{x}\), where \(\textbf{B}\) contains the variable vectors, \(\textbf{x}\) represents the individual’s measurements with respect to the original variables, and \(\textbf{o}\) holds the coordinates in the corresponding HJ-Biplot. This procedure generalizes the projection of multiple observations stored in matrix \(\textbf{Z}_j\), enabling a dynamic representation of individuals and variables over time.
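The slice-wise decomposition and the projection formula above can be illustrated with a minimal NumPy sketch. It assumes the usual HJ-Biplot convention in which row markers are \(\textbf{U}_t \mathbf {\Lambda }_t\) and column markers are \(\textbf{V}_t \mathbf {\Lambda }_t\), uses the column markers as the variable vectors \(\textbf{B}\), and runs on random data; the function names and the column-centering step are illustrative assumptions, not the procedure of26.

```python
import numpy as np

def hj_biplot_slice(X_t, n_dims=2):
    """HJ-Biplot markers for one frontal slice X_t (I x J), assumed column-centered."""
    U, s, Vt = np.linalg.svd(X_t, full_matrices=False)
    row_markers = U[:, :n_dims] * s[:n_dims]      # individuals: U_t Lambda_t
    col_markers = Vt.T[:, :n_dims] * s[:n_dims]   # variables:   V_t Lambda_t
    return row_markers, col_markers

def project_individual(B, x):
    """Least-squares coordinates o = (B^T B)^{-1} B^T x of one individual."""
    o, *_ = np.linalg.lstsq(B, x, rcond=None)
    return o

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 10, 6))      # hypothetical I x J x K tensor, not the Numbeo data
X_t = X[:, :, 0]                      # first frontal slice
X_t = X_t - X_t.mean(axis=0)          # column-centering (an assumption of this sketch)
rows, cols = hj_biplot_slice(X_t)
o = project_individual(cols, X_t[0])  # coordinates of the first individual
```

Repeating the projection for each temporal situation and joining the resulting points would give one trajectory per individual, which is the kind of dynamic reading described in the text.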

The ability of the dynamic HJ-Biplot to model three-way data allows for a direct comparison with the Tucker3 model, introduced in45,46. This model decomposes a tensor \(\underline{\textbf{X}}=(x_{ijk})\) of order \(I \times J \times K\) in terms of three loading matrices \(\textbf{A}=(a_{ip})\) of order \(I \times P\), \(\textbf{B}=(b_{jq})\) of order \(J \times Q\), and \(\textbf{C}=(c_{kr})\) of order \(K \times R\) and an interaction core \(\underline{\textbf{G}}=(g_{pqr})\) of order \(P \times Q \times R\). Each element of the data tensor is given by

$$\begin{aligned} x_{ijk} = \sum _{p=1}^{P} \sum _{q=1}^{Q} \sum _{r=1}^{R} g_{pqr} a_{ip} b_{jq} c_{kr} + e_{ijk}, \end{aligned}$$

where \(e_{ijk}\) is the error term. Methods for selecting the number of components P, Q, and R for each mode, based on building a convex hull, are described in47,48. The Parafac model, proposed in49,50, defines each element of the three-way table as follows

$$\begin{aligned} x_{ijk} = \sum _{r=1}^{R} a_{ir} b_{jr} c_{kr} + e_{ijk}, \end{aligned}$$

where \(\textbf{A} = (a_{ir})\) of order \(I \times R\) is the loading matrix in the first mode, \(\textbf{B} = (b_{jr})\) of order \(J \times R\) is the loading matrix in the second mode, and \(\textbf{C} = (c_{kr})\) of order \(K \times R\) is the loading matrix in the third mode, with R representing the number of components in each mode and \(e_{ijk}\) as the error term.
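As a sketch of how the two element-wise formulas translate into computation, the structural (error-free) parts of both models can be written with `numpy.einsum`. The loading matrices below are random placeholders and the function names are hypothetical. The example also illustrates the well-known fact that Parafac coincides with a Tucker3 model whose core is superdiagonal with unit entries and \(P=Q=R\).

```python
import numpy as np

def tucker3_reconstruct(G, A, B, C):
    """Structural part of Tucker3: x_ijk = sum_pqr g_pqr a_ip b_jq c_kr."""
    return np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

def parafac_reconstruct(A, B, C):
    """Structural part of Parafac: x_ijk = sum_r a_ir b_jr c_kr."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

rng = np.random.default_rng(1)
I, J, K, R = 10, 10, 6, 3
A = rng.normal(size=(I, R))   # placeholder loadings, first mode
B = rng.normal(size=(J, R))   # placeholder loadings, second mode
C = rng.normal(size=(K, R))   # placeholder loadings, third mode

# Superdiagonal unit core: g_rrr = 1 and zero elsewhere
G = np.zeros((R, R, R))
G[np.arange(R), np.arange(R), np.arange(R)] = 1.0

X_tucker = tucker3_reconstruct(G, A, B, C)
X_parafac = parafac_reconstruct(A, B, C)
```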

To improve the interpretability of the components in the loading matrices, a modified version of the Tucker3 model based on disjoint components, as proposed in51, is incorporated. For the Parafac model, a variant with disjoint components is given in52. The definition of the disjoint loading matrix, the optimization problems to be addressed, and the corresponding algorithms for calculating disjoint components are detailed in51,52.

The complexity of interpretation in a loading matrix arises when one or more entities have similar loadings (absolute value) in two or more components. In this situation, it is unclear which of these components best represents the entity. The use of disjoint components addresses this interpretative challenge, as each entity will have a non-zero loading in only one component, which will then serve as the representative for that entity. Tables 7, 9, and 11 provide examples of disjoint loading matrices.
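The property described above (each entity has a non-zero loading in exactly one component) can be checked mechanically. The following sketch uses a small hypothetical loading matrix, not one taken from Tables 7, 9, or 11, and hypothetical helper names; it only verifies the row-wise property stated in the text and does not compute disjoint components.

```python
import numpy as np

def is_disjoint(L, tol=1e-12):
    """True if every row (entity) has exactly one non-zero loading."""
    return bool(np.all((np.abs(L) > tol).sum(axis=1) == 1))

def groups_from_disjoint(L, labels):
    """Map each component index to the entities it represents."""
    owner = np.argmax(np.abs(L), axis=1)  # column holding the non-zero loading
    return {c: [lab for lab, o in zip(labels, owner) if o == c]
            for c in range(L.shape[1])}

# Hypothetical 4 x 2 disjoint loading matrix
L = np.array([[0.7,  0.0],
              [0.0, -0.6],
              [0.7,  0.0],
              [0.0,  0.8]])
labels = ["e1", "e2", "e3", "e4"]
groups = groups_from_disjoint(L, labels)
```

Here `groups` assigns `e1` and `e3` to the first component and `e2` and `e4` to the second, which mirrors how disjoint components partition the entities of a mode.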

The integration of these methods offers a comprehensive approach for analyzing economic and sustainability data in three dimensions, facilitating a clearer interpretation of the relationships between cost of living and quality of life indices across various cities or countries and time periods.

Proposed framework

In this section, we outline the framework proposed for analyzing three-way tables. An analyst studying data structured according to the Tucker3 model is advised to follow the steps detailed in Algorithm 1.

Algorithm 1 Proposed procedure for the statistical study with the Tucker3 model of cost of living and quality of life indices in countries of the American continent.

Before detailing our methodology, we will outline the key assumptions on which it is based: (i) the data can be decomposed into a linear structure using three matrices (one for each mode) and a central core capturing the interactions between the modes; (ii) linear relationships exist between the variables and the components of each mode; (iii) each mode consists of distinct sets of entities, meaning the three-way table represents three distinct modes; (iv) the data in each mode contain sufficient richness to make the decomposition into lower dimensions meaningful, allowing preserved variability to effectively capture relationships between entities. However, if the data lack structure or sufficient variability, dimensional reduction may offer limited value; and (v) the three-way table is properly constructed, the data have been appropriately extracted (low noise), and adequate treatment has been applied to missing values and outliers.

The procedure begins with the construction and assembly of the three-way table. Modes must be defined and associated with the dimensions. The data must then be pre-processed. Although the data analyst may decide not to carry out any pre-processing, we recommend at least centering and scaling the data in the mode where the variables to be studied are located. In this way, differences in scale or measurement of the variables will not affect the interpretation of the results53,54. Centering eliminates bias caused by differences in mean values across variables, while scaling ensures that variables with larger variances do not disproportionately influence the results. This pre-processing step is particularly important in the context of three-way tables, where the heterogeneity between modes could obscure meaningful patterns or structures. By centering and scaling, we aim to enhance the comparability and interpretability of the results55,56.
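One simple way to carry out this step is sketched below. It assumes that each variable is centered and scaled over all combinations of the other two modes. Several centering and scaling conventions exist for three-way data, so this is only one possible variant and not necessarily the one used in53,54,55,56. The tensor is random and the function name is hypothetical.

```python
import numpy as np

def center_scale_variables(X):
    """Center and scale each variable j of an I x J x K tensor over all (i, k) pairs."""
    mean = X.mean(axis=(0, 2), keepdims=True)   # one mean per variable
    std = X.std(axis=(0, 2), keepdims=True)     # one standard deviation per variable
    return (X - mean) / std

rng = np.random.default_rng(2)
# Hypothetical raw tensor whose variables have very different scales
X = rng.normal(loc=50.0, scale=10.0, size=(10, 10, 6)) * np.arange(1, 11)[None, :, None]
Xs = center_scale_variables(X)
```

After this transformation every variable has zero mean and unit variance across countries and years, so no variable dominates the decomposition merely because of its measurement scale.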

The next step is to determine the most suitable number of components for each mode; methods for doing so are described in47,48,57. Once the number of components for each mode is known, the next task is to compute the loading matrices. In practice, it is difficult to obtain interpretable loading matrices, so rotation methods are available that maintain the fit of the model while seeking interpretable components. Some rotation methods for the Tucker3 model are described in55, and an R package that performs the calculations is presented in56. If the loading matrices remain uninterpretable after rotation, which can happen, there are sparse methods that sacrifice some model fit but provide interpretable components. Some of these sparse methods for three-way tables are described in58,59,60. In our methodology, however, we recommend the use of disjoint components. Models with disjoint components have their origins in matrices (or two-way tables), as shown in61,62, and studies in which disjoint components have been applied to matrices can be found in63,64. For three-way tables, disjoint components are proposed in51,52. For the calculation of disjoint components, we suggest using Algorithm 2.

Algorithm 2 Procedure for calculating disjoint components in the loading matrices A, B and C with the Tucker3 model.

If we compute disjoint components in a loading matrix, the model fit is affected. Losing fit is the price to pay for gaining interpretation. This happens with all sparse techniques. The more disjoint components the loading matrices have, the more the model fit is impacted. We therefore recommend calculating disjoint components for all combinations of loading matrices. After that, the model that achieves the best balance between interpretation and fit must be chosen, which is left to the discretion of the data analyst. Table 6 is an example of what is proposed in Algorithm 2. If a loading matrix has disjoint components, interpretation becomes easier because each mode entity will belong to one and only one component.
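The recommendation to try all combinations of loading matrices amounts to enumerating the subsets of {A, B, C}. A minimal sketch of that enumeration follows; the fitting routine itself is not shown, and the function name is hypothetical. The empty subset corresponds to the model without disjoint components, which serves as the baseline for measuring fit loss.

```python
from itertools import product

def disjoint_configurations(modes=("A", "B", "C")):
    """All subsets of loading matrices that may be constrained to be disjoint."""
    configs = []
    for flags in product([False, True], repeat=len(modes)):
        configs.append(tuple(m for m, f in zip(modes, flags) if f))
    return configs

configs = disjoint_configurations()
for cfg in configs:
    # A fitting routine (not shown here) would be called once per configuration,
    # and its fit compared against the unconstrained baseline cfg == ().
    print("disjoint in:", cfg if cfg else "none")
```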

Comparative analysis with existing methods

In this section, we carry out an analysis comparing our proposed approach with some of the most widely recognized existing methods for three-way table analysis. The proposed framework is specifically designed for analyzing three-way tables, with an emphasis on cost of living and quality of life data. Our approach is designed to allow the data analyst, for each mode, to utilize the components for grouping the original entities unambiguously and, ultimately, for characterizing these components. Accordingly, the validity and reliability of our proposed framework are firmly rooted in its core design: the disjoint components. The computation of disjoint components, as described above, allows the interpretation of the data.

Our framework is structured in two stages. The first stage focuses on enabling the researcher to classify all entities across the three modes using disjoint components. The second stage is designed to help the researcher simultaneously contrast and complement the results obtained in the first stage through a dynamic HJ-Biplot. In Table 4, we present a comparative summary between disjoint components and other types of components in both the Parafac and Tucker3 models. However, as shown in Algorithm 1, we do not rule out the use of other types of components. For instance, if the desired data interpretability can be achieved with classical components, there would be no need to compute disjoint components. Our priority is always to preserve the fit of the model. If classical components are insufficient, we recommend using rotated components. Ultimately, we recommend computing disjoint components, recognizing that this would logically entail sacrificing some degree of the model’s fit.

Table 4 Comparison with existing component-based methods for three-way tables.

STATIS methods for three-way tables aim to identify a common structure among several related tables. By integrating information from all the tables, these methods uncover shared patterns and provide a clear representation of the connections between the different dimensions of the data. STATIS methods are specifically designed to compare and integrate multiple tables, whereas in our proposed approach, we use components to reduce the dimensionality of the modes. Although the STATIS family does not guarantee data interpretability in the sense defined in this manuscript, it can be integrated into our methodology as a replacement for the dynamic HJ-Biplot, serving as a contrasting method to the disjoint components. Applications of STATIS methods to the analysis of three-way tables can be found in27,68,69,70.

Three-Way Clustering methods can be used to create clusters in each of the three modes of a three-way table. These methods are specifically designed to analyze data structured in three dimensions and allow for the simultaneous identification of patterns or groups within each mode. It is important to note that the relationships between the modes are considered to optimize the clustering process, meaning that clusters in one mode are influenced by the groupings in the other modes. In Three-Way Clustering methods, it is possible for an entity from any of the modes to belong to multiple clusters, particularly when applying soft or fuzzy clustering approaches. These methods assign entities to clusters based on degrees of membership or probabilities, allowing for more nuanced representations of ambiguous relationships. In contrast, hard clustering methods enforce strict and exclusive assignments, where each entity belongs to only one cluster. In this case, data interpretability, as defined in this manuscript, is achieved. Therefore, the use of hard clustering methods could be integrated into our framework as a substitute for disjoint components. The theory and applications of some Three-Way Clustering methods can be found in71,72,73,74,75.

At the end of the “Conclusions” section, we outline future work focused on proposing new methods for the analysis of three-way tables.

Results and discussion

Determination of the number of components

The three-way table used for this statistical analysis can be found in the Supplementary Information files. For pre-processing, we performed centering and scaling in the second mode, where the variables are located. To determine the number of components for each mode, we used the convex hull procedure described in47. To do this, we computed the residual sum of squares and the fit (in percentage) for all the models, varying the number of components P in the first mode from 1 to 5, the number of components Q in the second mode from 1 to 5, and the number of components R in the third mode from 1 to 4. The 100 models obtained can be found in the Supplementary Information files. Note that \(S=P+Q+R\) represents the total number of components in the model. Given that our three-way table has a size of \(10 \times 10 \times 6\) (10 countries, 10 variables, and 6 years), our intention in the experimental phase was to ensure that the number of components does not exceed half of the original entities in each mode. In constructing the case study, it was deemed impractical to analyze 10 real countries using, for instance, 7 or 8 latent countries. Following the same rationale for the variables and years, we decided to set the upper bounds at \(P=5\), \(Q=5\), and \(R=4\).
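The enumeration of the candidate models and the hull construction can be sketched as follows. This is only an illustration of the geometric step (keeping the best-fitting model for each total number of components \(S\) and then the upper boundary of the convex hull of the resulting points); the full procedure in47 includes further selection criteria that are not reproduced here. The function names are hypothetical and the fit values in the toy example are made up.

```python
from itertools import product

def model_grid(max_p=5, max_q=5, max_r=4):
    """All (P, Q, R) combinations within the chosen upper bounds."""
    return list(product(range(1, max_p + 1), range(1, max_q + 1), range(1, max_r + 1)))

def best_per_complexity(fits):
    """fits: dict (P, Q, R) -> fit in percent. Keep the best model for each S = P + Q + R."""
    best = {}
    for (p, q, r), f in fits.items():
        s = p + q + r
        if s not in best or f > best[s][1]:
            best[s] = ((p, q, r), f)
    return best

def upper_hull(points):
    """Upper convex hull of (S, fit) points, scanned from left to right."""
    hull = []
    for s, f in sorted(points):
        while len(hull) >= 2:
            (s1, f1), (s2, f2) = hull[-2], hull[-1]
            # Drop the middle point if it lies on or below the segment to the new point
            if (s2 - s1) * (f - f1) - (f2 - f1) * (s - s1) >= 0:
                hull.pop()
            else:
                break
        hull.append((s, f))
    return hull

grid = model_grid()

# Toy (S, fit) pairs, invented for illustration only
toy_points = [(3, 50.0), (4, 70.0), (5, 75.0), (6, 90.0), (7, 92.0)]
toy_hull = upper_hull(toy_points)
```

In the toy example the point with \(S=5\) is discarded because it lies below the segment joining its neighbors, which is how models that add complexity without a proportional gain in fit are excluded.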

Figures 1 and 2 show the convex hull procedure plots, in which all models are placed on a plane with the horizontal axis representing the number of components. Both plots show the polygonal line of the convex hull, from which we must select the model for the analysis with Tucker3. The candidates are models 1, 25, 49, 50, 74, 98, 99 and 100, which are listed in Table 5. Although model 49, which has seven components, appears to be the best choice, we selected model 50, which uses one more component in the third mode (two instead of one), for illustrative purposes. Our aim is to demonstrate how the disjoint components facilitate the grouping of the original entities across all modes. In addition, dividing the six years into two periods may provide valuable insights in the study.

Table 5 Models that lie on the polygonal line of the convex hull.
Fig. 1 Convex hull procedure plot using the sum of squares (residual).

Fig. 2 Convex hull procedure plot using the model fit.

Computation of the disjoint components

After selecting \(P=3\), \(Q=3\), and \(R=2\), we computed the loading matrices. As the matrices were not interpretable, we performed a VARIMAX rotation. However, after rotation, there was no significant improvement in the interpretation of the loading matrices. We then proceeded with the calculation of disjoint components in the loading matrices \(\textbf{A}\), \(\textbf{B}\) and \(\textbf{C}\), as shown in Table 6.

Table 6 Comparison of fit between different models with disjoint components.

Table 6 shows that computing disjoint components in loading matrices \(\textbf{A}\) and \(\textbf{B}\) is more expensive, in terms of fit, than computing them in matrix \(\textbf{C}\). It should be noted that as more modes utilize disjoint components, the model’s fit decreases. For the benefit of data analysis, we recommend at least calculating disjoint components in the mode containing the variables. However, the final decision is left to the discretion of the researcher. Now, observe that if we calculate disjoint components only in matrix \(\textbf{B}\), we lose \(9.0888\%\) of fit. If we calculate disjoint components in both loading matrices \(\textbf{A}\) and \(\textbf{B}\), we lose \(13.5393\%\) of fit. In other words, adding disjoint components in \(\textbf{A}\) results in an additional loss of \(4.4505\%\) of fit. Moreover, computing disjoint components in all loading matrices results in a fit loss of \(13.8316\%\) compared with the original fit without disjoint components. After conducting a comparative analysis of the models with disjoint components, we decided to retain the model that uses disjoint components in matrices \(\textbf{A}\), \(\textbf{B}\), and \(\textbf{C}\), which has a fit of \(79.4953\%\).
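The figures quoted in this paragraph can be cross-checked with simple arithmetic. The following worked calculation assumes that all losses are expressed in percentage points relative to the model without disjoint components; the fits labeled "none", \(\textbf{B}\), and \(\textbf{A},\textbf{B}\) are derived here under that assumption and may differ from the entries of Table 6 by rounding.

$$\begin{aligned} \text{fit}_{\text{none}}&= 79.4953 + 13.8316 = 93.3269, \\ \text{fit}_{\textbf{B}}&= 93.3269 - 9.0888 = 84.2381, \\ \text{fit}_{\textbf{A},\textbf{B}}&= 93.3269 - 13.5393 = 79.7876, \\ \text{fit}_{\textbf{A},\textbf{B}} - \text{fit}_{\textbf{A},\textbf{B},\textbf{C}}&= 79.7876 - 79.4953 = 0.2923. \end{aligned}$$

Under this reading, making \(\textbf{C}\) disjoint on top of \(\textbf{A}\) and \(\textbf{B}\) would cost roughly \(0.29\) percentage points, which is consistent with the observation that disjoint components in \(\textbf{C}\) are the least expensive in terms of fit.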

The loading matrix \(\textbf{A}\) can be seen in Table 7, both with disjoint components and with non-disjoint components. It is important to note that, for non-disjoint components, the interpretation becomes complicated for countries like Colombia, Mexico, Peru and Uruguay due to the similar loadings across different components. However, with disjoint components the interpretation becomes very simple. Each disjoint component is a latent variable that can be characterized, allowing us to group countries. Figure 3 shows the country space using disjoint components. We can observe the presence of three clearly defined groups of countries due to the use of disjoint components. Table 8 shows how the countries have been grouped and the characterization of the disjoint components in the first mode.

Table 7 Loading matrix \(\textbf{A}\) in the Tucker3 model with three components.
Table 8 Disjoint components characterized for countries.
Fig. 3 Plot of the reduced-dimensional space for the countries with disjoint components.

The loading matrix \(\textbf{B}\) can be seen in Table 9 with both disjoint and non-disjoint components. The interpretation also becomes complicated in this second mode if we use non-disjoint components. It is difficult to determine the appropriate component for the variables Healthcare, Pollution, Groceries, and Restaurant Prices. Once more, disjoint components simplify interpretation. Figure 4 shows us the space of variables with disjoint components. Note that, with disjoint components, there are variables that have loadings with different signs on the same component. If this occurs, it means that the variables have opposite or contrary relationships within that component. We must keep this in mind when interpreting the loadings. The contrary signs in COMP1-B indicate that individuals with lower purchasing power can benefit from high-quality public transportation services. Component COMP2-B reflects an urban environment characterized by high levels of security and low pollution, but also by elevated food prices, both in markets and restaurants. Component COMP3-B suggests that people have a higher relative income and can afford higher rents, but experience inadequate healthcare quality and less favorable climate conditions. Table 10 shows the characterization of the disjoint components in the second mode.

Table 9 Loading matrix \(\textbf{B}\) in the Tucker3 model with three components.
Table 10 Disjoint components characterized for variables (cost of living and quality of life).
Fig. 4 Plot of the reduced-dimensional space for the variables with disjoint components.

Table 11 shows the loading matrix \(\textbf{C}\) with disjoint components and with non-disjoint components. With non-disjoint components, it becomes challenging to divide the years into two groups because some years exhibit similar loadings (in absolute value). However, within this mode, the non-disjoint components provide a more detailed and informative perspective. Notably, the non-disjoint COMP2-C component assigns quantitative weights for the years, ranking them from the “most challenging” year, 2020 (−0.5048362), to the “most promising recovery,” 2024 (0.7227627). Additionally, the year 2022 is characterized by a “weak economic recovery” (0.0131600). Despite the utility and importance of disjoint components, we recommend always computing classical components, as they can provide valuable information that complements the conclusions drawn from the disjoint components.

Table 11 Loading matrix \(\textbf{C}\) in the Tucker3 model with two components.

Using disjoint components, we can conclude that the years 2023 and 2024 are represented in COMP1-C, while the years 2019, 2020, 2021, and 2022 are represented by COMP2-C. Figure 5 shows the space of the years. The characterization of the disjoint components in the third mode can be seen in Table 12. The United States and Canada are considered the countries with the strongest economies in the American continent. Both have high per capita incomes, developed infrastructure, and are leaders in various economic sectors. Many other countries in the continent depend heavily on the economies of these two countries. The trade conflicts between the United States and China, the global economic slowdown, inflation, Brexit, and the decline of the US dollar affected the economy of the American continent in 2019. Three very difficult years followed (2020, 2021, and 2022), during which the global economy collapsed, including the economies of the countries in our study. This is consistent with the grouping above: the years 2019 to 2022 are represented in one component, while the years 2023 and 2024, which witnessed a global economic recovery, are represented in another.

Table 12 Disjoint components characterized for years.
Fig. 5 Plot of the reduced-dimensional space for the years with disjoint components.

Table 13 shows the core \(\underline{\textbf{G}}\) of the Tucker3 model with disjoint components in all loading matrices. The core contains the weights of the interactions between the components of the three modes. Interactions with absolute weights exceeding 4.7 are considered the most significant. For each group of countries, the analysis will focus on the most relevant interactions. It should be noted that these interactions have a negative sign; therefore, we have inverted the sign of the corresponding component of the second mode (the variables’ mode) to ensure an accurate interpretation.
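Selecting the most significant interactions is a simple thresholding operation on the core. The sketch below applies the 4.7 cut-off to a small hypothetical core; the entries are invented for illustration and are not the values of Table 13, and the function name is an assumption of this example.

```python
import numpy as np

def salient_interactions(G, threshold=4.7):
    """Return (p, q, r, weight) for core entries with |weight| > threshold, largest first."""
    idx = np.argwhere(np.abs(G) > threshold)
    items = [(int(p), int(q), int(r), float(G[p, q, r])) for p, q, r in idx]
    return sorted(items, key=lambda t: -abs(t[3]))

# Hypothetical 2 x 2 x 2 core (indices are zero-based)
G = np.array([[[-9.0, 0.5],
               [ 1.0, -2.0]],
              [[ 0.3, -5.0],
               [ 4.7,  0.1]]])
salient = salient_interactions(G)
```

Each returned tuple links one component of each mode, so a negative weight on \((p, q, r)\) is read together with the signs of the loadings in the corresponding components, as discussed in the text.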

A detailed analysis of Table 13 reveals that, prior to the pandemic, the weights of interactions exhibited significantly higher absolute values compared to the post-pandemic period. These findings clearly demonstrate the impact of COVID-19. For instance, in the first group of countries, a comparison of weights \(-9.0994\) and \(-6.5583\) underscores this trend. In the second group, weights of \(-7.3818\) and \(-6.3407\) illustrate this shift. Finally, in the third group, comparing weights of \(-7.4467\) and \(-5.4902\) further confirms our assertion. Our disjoint component model was able to detect what stands out the most in each group of countries. Note that any analysis conducted using the core and loading matrices is relative, meaning it is performed in a comparative context among the countries included in the study.

Table 13 Core \(\underline{\textbf{G}}\) in the Tucker3 model with three disjoint loading matrices.

In Chile, Colombia, Ecuador, Mexico, and Uruguay, during the period of economic crisis, there was access to quality hospital services and generally good climatic conditions. Furthermore, rental prices remained relatively stable, as did the relationship between individuals’ income and the prices of goods and services. Note that this situation persisted during the years of economic recovery, accompanied by a moderate improvement in the economy.

In Canada and the United States, despite the pandemic’s impact, both countries maintained a strong and stable economy. Overall, individuals retained high purchasing power. However, transportation services were adversely affected. This trend persisted, and the economic recovery was swift.

Argentina, Brazil, and Peru were among the countries most severely impacted by the coronavirus. These nations faced substantial security challenges and elevated levels of environmental pollution. While food prices remained relatively low, the economy experienced a sluggish recovery.

Next, we proceed with the calculation of three components in the Parafac model. When traditional components are computed, the model fit reaches \(93.59\%\). In contrast, calculating disjoint components exclusively in loading matrix \(\textbf{B}\) results in a model fit of \(83.98\%\), which corresponds to a reduction of \(9.61\) percentage points in model fit.
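As a rough check of this arithmetic, the sketch below computes the loss of fit from the two reported percentages. The function `fit_percentage` assumes the common definition of fit as the share of the total sum of squares reproduced by the model; that definition and all names are assumptions for illustration, not necessarily the exact formula used in the study.

```python
import numpy as np

def fit_percentage(X, X_hat):
    """Percentage of the total sum of squares of X reproduced by X_hat
    (assumed definition of model fit)."""
    residual = np.sum((X - X_hat) ** 2)
    total = np.sum(X ** 2)
    return 100.0 * (1.0 - residual / total)

# Fits reported in the text for the Parafac model with three components.
fit_classical = 93.59
fit_disjoint = 83.98

# Absolute loss, in percentage points.
loss_points = fit_classical - fit_disjoint
# Loss relative to the classical fit, in percent.
loss_relative = 100.0 * loss_points / fit_classical

print(round(loss_points, 2), round(loss_relative, 2))
```

Reporting the absolute loss in percentage points avoids confusion with the relative loss, which is a different quantity.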

The degeneracy problem in the Parafac model arises when two components exhibit a strong inverse correlation. To identify this, the component correlation matrix is typically computed. If any value in this matrix approaches \(-1\), it suggests that the solution may be degenerate. To address this issue, orthogonality constraints are commonly imposed on the model, particularly within one of the loading matrices55. In Table 14, the component correlation matrix is presented, indicating the potential presence of the degeneracy problem in the solution due to the correlation of approximately \(-0.7\) between components COMP1 and COMP3.
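A diagnostic of this kind can be sketched as follows. The example uses the triple cosine, that is, the product across the three modes of the cosines between pairs of component columns, which is one common way of building such a matrix; whether Table 14 was computed exactly this way is not stated here, and the function names, the toy loading matrices, and the cutoff of \(-0.7\) are assumptions made for illustration.

```python
import numpy as np

def column_cosines(M):
    """Cosine between every pair of columns of M."""
    M_unit = M / np.linalg.norm(M, axis=0, keepdims=True)
    return M_unit.T @ M_unit

def triple_cosines(A, B, C):
    """Element-wise product of the column cosines of the three loading matrices."""
    return column_cosines(A) * column_cosines(B) * column_cosines(C)

def flag_degenerate_pairs(T, cutoff=-0.7):
    """Return (s, t, value) for component pairs whose value is at or below cutoff."""
    n = T.shape[0]
    return [(s, t, float(T[s, t]))
            for s in range(n) for t in range(s + 1, n)
            if T[s, t] <= cutoff]

# Toy loadings with two components: nearly opposite in A, nearly equal in B and C.
A = np.array([[1.0, -1.0], [2.0, -2.1], [3.0, -2.9]])
B = np.array([[1.0, 1.0], [1.0, 1.1]])
C = np.array([[1.0, 1.0], [1.0, 1.1]])

T = triple_cosines(A, B, C)
flagged = flag_degenerate_pairs(T)
print(flagged)
```

A value close to \(-1\) for a pair of components is the warning sign described above; the cutoff only controls how early the pair is reported.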

Table 14 Component correlation matrix in the Parafac model.

The primary advantage of disjoint components lies in the ease of interpretation they provide to researchers. A secondary benefit in the Parafac model is their ability to eliminate the degeneracy problem in the solution. The calculation of a disjoint loading matrix results in a structure where the columns are composed of orthonormal vectors. This approach yields two simultaneous benefits when calculating disjoint components in the Parafac model52. In Table 15, we present the loading matrix \(\textbf{B}\), showcasing both disjoint and non-disjoint components. By comparing this with the loading matrix \(\textbf{B}\) from the Tucker3 model displayed in Table 9, we can conclude that the same insights are achieved.
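The link between disjointness and orthonormality can be verified numerically. In the sketch below, each variable loads on exactly one component and every column is scaled to unit length, so the columns have non-overlapping supports and \(\textbf{B}^{\top}\textbf{B}\) is the identity. The function name, the weights, and the assignment of variables to components are hypothetical and are not the values of Table 15.

```python
import numpy as np

def disjoint_loading_matrix(weights, assignment, n_components):
    """Build a J x Q matrix in which variable j loads only on component
    assignment[j], with every column scaled to unit norm."""
    weights = np.asarray(weights, dtype=float)
    B = np.zeros((weights.size, n_components))
    B[np.arange(weights.size), assignment] = weights
    norms = np.linalg.norm(B, axis=0)
    if np.any(norms == 0):
        raise ValueError("every component needs at least one nonzero loading")
    return B / norms

# Hypothetical weights for six variables split across three components.
weights = [0.8, -0.5, 0.3, 0.9, 0.4, -0.7]
assignment = [0, 0, 1, 1, 2, 2]

B = disjoint_loading_matrix(weights, assignment, n_components=3)
is_orthonormal = np.allclose(B.T @ B, np.eye(3))
print(is_orthonormal)
```

Since the orthogonality follows from the structure of the matrix itself, no separate orthogonality constraint has to be imposed on that mode.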

Table 15 Loading matrix \(\textbf{B}\) in the Parafac model with three components.

Analysis of the dynamic HJ-Biplot

In this section, we will utilize a dynamic HJ-Biplot to compare the results obtained from the Tucker3 model and the disjoint components. In Fig. 6, we observe the static HJ-Biplots corresponding to each of the six years of the study, spanning from 2019 to 2024. These plots enable a detailed analysis of the events that occurred in each individual year. Figure 7a presents the dynamic HJ-Biplot for the countries in the study, while Fig. 7b displays the dynamic HJ-Biplot for the considered variables. In both instances, 2019 has been taken as the reference year from which the dynamics are generated.

Fig. 6 HJ-Biplot for each year of the case study (2019–2024).

Fig. 7 Dynamic HJ-Biplot for the countries and variables in the case study, with 2019 serving as the reference year.

The first two components explain \(83.77\%\) of the variability in the data, which is quite substantial. The first component is the most significant, capturing \(71.33\%\) of the data variability, while the second component represents \(12.44\%\). To initiate the analysis, it is noteworthy that the clustering of countries suggested by the dynamic HJ-Biplot aligns with the results obtained from the Tucker3 model.
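For readers unfamiliar with how such percentages arise, the following sketch computes a static HJ-Biplot for a single two-way table through the singular value decomposition, using the standard HJ-Biplot markers (rows as \(\textbf{U}\textbf{D}\), columns as \(\textbf{V}\textbf{D}\)). The standardization step, the function name, and the random data are assumptions; the sketch does not reproduce the dynamic version or the data of the study.

```python
import numpy as np

def hj_biplot(X, n_axes=2):
    """Row markers, column markers and explained variability (%) of an
    HJ-Biplot computed on the column-standardized matrix X."""
    Xc = X - X.mean(axis=0)
    Xc = Xc / Xc.std(axis=0)  # assumes no constant columns
    U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
    rows = U[:, :n_axes] * d[:n_axes]     # row markers: U D
    cols = Vt[:n_axes].T * d[:n_axes]     # column markers: V D
    explained = 100.0 * d ** 2 / np.sum(d ** 2)
    return rows, cols, explained[:n_axes]

# Hypothetical table: 10 individuals (rows) by 10 variables (columns).
rng = np.random.default_rng(1)
X = rng.normal(size=(10, 10))

rows, cols, explained = hj_biplot(X)
print(explained, explained.sum())
```

The variability accumulated by the plane is the sum of the percentages of its two axes, which is how \(71.33\% + 12.44\% = 83.77\%\) is obtained in the text.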

Now it is important to characterize the components (axes) and interpret the results. Axis 1 (Horizontal) is characterized as “Economic Conditions and Cost of Living”: On the right, Canada and the United States are associated with Purchasing Power, Rent, and Restaurant Prices, indicating a higher purchasing capacity and elevated living costs. On the left, countries such as Argentina, Brazil, and Peru are more aligned with Price Income Ratio and Climate, suggesting economies with lower purchasing power and a greater impact of costs on quality of life. The presence of economic variables at the extremes implies that this axis distinguishes countries based on their financial situation and standard of living.

Axis 2 (Vertical) is characterized as “Quality of Life and Environmental Factors”: At the top, variables such as Healthcare and Climate are more closely associated with countries that demonstrate better access to health services and favorable climatic conditions, such as Colombia, Ecuador, and Mexico. At the bottom, Pollution and Transportation are associated with countries like Peru and Brazil, indicating that these regions may face greater challenges in these areas. This axis appears to separate countries according to their infrastructure and social well-being.

In conclusion, the proposed labels reflect how countries are grouped based on economic factors along the horizontal axis and quality of life factors along the vertical axis, enabling an interpretation of each country’s trends in relation to their development and well-being.

From a temporal perspective, if we examine the initial positions of the countries and variables in 2019 and compare them with the positions in subsequent years, the impact of the COVID-19 pandemic becomes evident. Despite this, the overall structure remains relatively unchanged. This was detected by the core of the Tucker3 model, where the weights of the interactions were correspondingly high in absolute value both before and after the pandemic, as we can see in Table 13.

In model 46, a fit of \(85.114\%\) is achieved. When calculating disjoint components in the loading matrix \(\textbf{B}\) for this model, we obtain the matrix presented in Table 16, resulting in a model fit of \(75.07\%\). We calculated two disjoint components to enable a more accurate comparison, as an HJ-Biplot also utilizes two components to project individuals and objects onto a plane.

Table 16 Loading matrix \(\textbf{B}\) in the Tucker3 model with two disjoint components.

Note that in the first component of the dynamic HJ-Biplot, the variables that most strongly represent it with positive loadings are Groceries, Restaurant Prices, and Purchasing Power. In contrast, the variables Pollution and Transportation have a considerable negative loading in the same component. A similar pattern is observed in component COMP1-B of Table 16. If we now compare the second component of the dynamic HJ-Biplot with component COMP2-B, we can note that in both cases, the most prominent positive loading is associated with the variable Healthcare, while the most significant negative loading is attributed to the variable Rent. We can assert that both multivariate techniques yield equivalent interpretations.

Software used in computational experiments

Table 17 lists the software resources used to implement the calculations performed in the computational experiments. Towards the end of the “Conclusions” section, we highlight an important direction for future work, which involves making our implementation accessible to the broader research community. This step aims to foster collaboration and encourage other researchers to build upon our proposal, thereby advancing its development and application.

Table 17 Software resources.

Conclusions

This manuscript presents a new framework for analyzing a three-way table that integrates disjoint principal component analysis with the Tucker3 and Parafac models, along with a dynamic HJ-Biplot analysis. Algorithm 1 outlines the steps we recommend applying to economic studies related to cost of living and quality of life indices. To illustrate the use of our proposal, we carried out a statistical analysis with ten countries from the Americas, ten variables, and a period of six years (2019–2024).

Regarding the framework, the positive points to highlight are: (i) disjoint components in the Tucker3 and Parafac models enhance the interpretability of the loading matrices, since, when applied to a specific mode, each entity in that mode is represented by a single component, enabling clear and unambiguous associations between entities and components; (ii) the use of classical components should not be disregarded, as they can offer complementary information to the disjoint components, as demonstrated in the years mode of our case study; (iii) the use of a dynamic HJ-Biplot enables the study of how individuals and variables behave in the same plane across the entities of the third mode (which can generally represent situations, locations, or time periods); (iv) the implementation of a comparison between both methods facilitates the evaluation of results and ensures a more accurate interpretation of the data; and (v) although we employ well-established methods, the framework we propose holds significant value not only due to the selection of methods but, more importantly, because of the innovative manner in which these methods have been integrated. This integration approach highlights the potential of our framework to contribute to advancements in the field.

Regarding the case study, the following relevant conclusions were drawn: (i) countries were grouped into three clusters, with the United States, Colombia, and Brazil serving as representative nations for each respective group; (ii) The United States has economic strength, with its citizens enjoying high purchasing power and elevated levels of security, but facing issues in transportation services; (iii) Colombia stands out for its quality healthcare services and favorable climatic conditions, although it suffers from high levels of pollution; and (iv) Brazil is notable for the relative stability of food prices, citizens’ incomes, and housing rents despite the economic crisis caused by COVID-19. All countries were impacted by the pandemic, but some were able to withstand the crisis better than others, primarily due to their economic capacity and established infrastructure.

Some limitations of our methodology are: (i) there is a slight loss of fit in the model when calculating disjoint components; (ii) processing times and memory consumption increase when calculating disjoint components, especially with larger tensors; and (iii) an alternative method could be used for comparison other than the dynamic HJ-Biplot, such as one from the STATIS family, or a Three-Way Clustering method.

Future work includes creating a library in both Python and R to enable other researchers to use our methodology. In addition, similar statistical studies on cost of living and quality of life indices could be conducted for other countries or at the level of cities (for example, capital cities around the world). Another important direction is to develop a Disjoint STATIS that employs disjoint components and is suitable for the analysis of three-way tables. Finally, the algorithms implemented in this study can be enhanced to improve their response times, allowing them to be applicable even to large datasets.

The highlights of our methodology are:

  • A novel statistical framework is proposed to facilitate the analysis of three-way data and the generation of insightful findings.

  • The new framework can be particularly applied to the analysis of economic variables related to cost of living and quality of life indices, providing valuable insights for understanding these critical aspects.

  • The core of the framework utilizes disjoint components within both the Tucker3 and Parafac models to examine the latent structure in the sample data, facilitating the interpretation of the loading matrices. Additionally, the results are contrasted with a dynamic HJ-Biplot analysis to derive well-founded conclusions.

  • The framework we propose also utilizes classical components, which should be employed to extract information that complements the insights provided by the disjoint components.

  • The use of disjoint components within a single loading matrix in the Parafac model implicitly enforces orthogonality conditions, thereby eliminating the degeneracy problem and facilitating the attainment of an interpretable trilinear solution.

The highlights of our case study are:

  • The application of the new framework to the case study facilitated the identification of three distinct groups of countries, enabling a comprehensive analysis of the similarities and differences among them through three latent variables during the economic recession caused by COVID-19 and the subsequent recovery period.

  • The countries are categorized into the following groups: (Moderate economic stability and progressive recovery) Chile, Colombia, Ecuador, Mexico, Uruguay; (Economic resilience and rapid recovery) Canada, USA; (Prolonged economic recession and challenging recovery) Argentina, Brazil, Peru.

  • The variables are grouped as follows: (Accessibility and purchasing power) Transportation opposes Purchasing Power; (Safety, food, and contamination) Security, Restaurant Prices, Groceries oppose Pollution; (Cost of living, medical services, and environmental quality) Rent, Price-to-Income Ratio oppose Healthcare, Climate.

  • The disjoint components classified the study years into: (Period of economic crisis) 2019–2022 and (Period of economic recovery) 2023–2024. The classical components revealed an important detail: 2020 was the worst year in economic terms, and although there was a slight improvement in 2021, it was still a markedly poor year. However, while the disjoint components placed 2022 within the first group, the classical components indicate that there was a slight improvement in 2022, and overall, the economies of the countries exhibited signs of recovery.

  • Regarding the interactions between modes, the case study identified three distinct clusters of countries, with the United States, Colombia, and Brazil serving as representative nations for each group. The United States exhibited economic resilience, characterized by high purchasing power and security levels despite challenges in transportation services. Colombia stood out for its quality healthcare and favorable climatic conditions, though it faced significant pollution levels. Brazil demonstrated relative stability in food prices, income, and housing rents, even amidst the economic crisis caused by COVID-19. Although all countries were affected by the pandemic, their responses and recovery differed depending on economic capacity and infrastructure strength.