Abstract
Higher-order interactions underlie complex phenomena in systems such as biological and artificial neural networks, but their study is challenging due to the scarcity of tractable models. By leveraging a generalisation of the maximum entropy principle, we introduce curved neural networks as a class of models with a limited number of parameters that are particularly well-suited for studying higher-order phenomena. Through exact mean-field descriptions, we show that these curved neural networks implement a self-regulating annealing process that can accelerate memory retrieval, leading to explosive order-disorder phase transitions with multi-stability and hysteresis effects. Moreover, by analytically exploring their memory-retrieval capacity using the replica trick, we demonstrate that these networks can enhance memory capacity and robustness of retrieval over classical associative-memory networks. Overall, the proposed framework provides parsimonious models amenable to analytical study, revealing higher-order phenomena in complex networks.
Introduction
Complex physical, biological, and social systems often exhibit higher-order interdependencies that cannot be reduced to pairwise interactions between their components1,2. Recent studies suggest that higher-order organisation is not the exception but the norm, providing various mechanisms for its emergence3,4,5,6. Modelling studies have revealed that higher-order interactions (HOIs) underlie collective activities such as bistability, hysteresis, and ‘explosive’ phase transitions associated with abrupt discontinuities in order parameters4,7,8,9,10,11.
HOIs are particularly important for the functioning of biological and artificial neural systems. For instance, they shape the collective activity of biological neurons12,13, being directly responsible for their inherent sparsity5,13,14,15 and possibly underlying critical dynamics16,17. HOIs have also been shown to enhance the computational capacity of artificial recurrent neural networks18,19. More specifically, ‘dense associative memories’ with extended memory capacity20,21,22,23 are realised by specific non-linear activation functions, which effectively incorporate HOIs. These non-linear functions are related to attention mechanisms of transformer neural networks24 and the energy landscape of diffusion models25,26, leading to the conjecture that HOIs underlie the success of these state-of-the-art deep learning models.
Despite their importance, existing studies of HOIs face significant computational challenges. Analytically tractable models that incorporate HOIs typically limit interactions to a single order (e.g., \(p\)-spin models22,27,28). Otherwise, attempting to represent diverse HOIs exhaustively results in a combinatorial explosion29. This issue is pervasive, restricting investigations of higher-order interaction models—such as contagion9, Ising19, or Kuramoto30 models—to highly homogeneous scenarios3,16 or to models of relatively low order9,11,31. While attempts have been made to model all orders of HOIs and perform theoretical analyses20,21,22,23,32,33,34,35,36,37, it is currently unclear how to construct parsimonious models to address the diverse effects of HOIs in a principled manner.
To address this challenge, here we employ an extension of the maximum entropy principle to capture HOIs through the deformation of the space of statistical models. When applied to neural networks, our approach generalises classical neural network models to yield a family of curved neural networks that effectively incorporate HOIs of all orders. The resulting models have rich connections with the literature on the statistical physics of neural networks21,22,27,34. These features enable the exploration of various aspects of HOIs using techniques including mean-field approximations, quenched disorder analyses, and path integrals.
Our analyses reveal how relatively simple curved neural networks exhibit some of the hallmark characteristics of higher-order phenomena, such as explosive phase transitions, arising both in mean-field models and in more complex transitions to spin-glass states. These phenomena are driven by a self-regulated annealing process, which accelerates memory retrieval through positive feedback between energy and an ‘effective’ temperature—a perspective that can also explain memory-retrieval dynamics in other modern artificial networks. Furthermore, we show—both analytically and experimentally—that this mechanism can lead to an increase in the memory capacity or robustness of memory retrieval in these neural networks. Overall, the core contributions of this work are (i) the development of a parsimonious neural network model based on the maximum entropy principle that captures interactions of all orders, (ii) the discovery of a self-regulated annealing mechanism that can drive explosive phase transitions, and (iii) the demonstration of enhanced memory capacity resulting from this mechanism.
Results
High-order interactions in curved manifolds
The maximum entropy principle (MEP) is a general modelling framework based on the principle of adopting the model with maximal entropy compatible with a given set of observations, under the rationale that one should not assume any structure beyond what is specified by the assumptions or features selected from the data38,39. The traditional formulation of the MEP is based on Shannon’s entropy40, and the resulting models correspond to Boltzmann distributions of the form \(p({{\boldsymbol{x}}})=\exp \left({\sum }_{a}{\theta }_{a}{f}_{a}({{\boldsymbol{x}}})-\varphi \right)\), where x = (x1, …, xn), φ is a normalising potential, and θa are parameters constraining the average value of observables \(\langle \, {f}_{a}({{\boldsymbol{x}}}) \rangle\). While observables are often set to low orders (e.g. fi(x) = xi, fij(x) = xixj, corresponding to first and second order statistics), higher-order interdependencies can be included by considering observables of the type fI(x) = ∏i∈Ixi, where I is a set of indices of order k = ∣I∣. Unfortunately, an exhaustive description of interactions up to order k ≫ 1 becomes unfeasible in practice due to an exponential number of terms (for more details on the MEP, see Supplementary Note 1).
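To make the scale of this combinatorial explosion concrete, the following minimal Python sketch (ours, purely illustrative) counts the interaction parameters required when constraining all observables up to order k for n = 100 binary variables.

```python
# Number of interaction parameters theta_I for f_I(x) = prod_{i in I} x_i with
# |I| <= k on n binary variables: C(n, 1) + C(n, 2) + ... + C(n, k).
from math import comb

n = 100
for k in (1, 2, 3, 5, 10):
    n_params = sum(comb(n, order) for order in range(1, k + 1))
    print(f"orders up to {k:2d}: {n_params:,} parameters")
```

Already at order k = 10, a system of 100 units requires over 10^13 parameters, which is why exhaustive high-order MEP models are unfeasible in practice.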
The MEP can be expanded to include other entropy functionals such as Tsallis’41 and Rényi’s42. Concretely, maximising the Rényi entropy (with the scaling parameter γ ≥ −1)43 while constraining \(\langle \, {f}_{a}({{\boldsymbol{x}}})\rangle\) (i.e., the expectation of features under p(x)) results in models of the form (see Supplementary Note 1)

\({p}_{\gamma }({{\boldsymbol{x}}})={e}^{-{\varphi }_{\gamma }}{\left[1+\gamma {\sum }_{a}{\theta }_{a}\,{f}_{a}({{\boldsymbol{x}}})\right]}_{+}^{1/\gamma },\)  (2)

where φγ is a normalising constant given by

\({\varphi }_{\gamma }=\ln {\sum }_{{{\boldsymbol{x}}}}{\left[1+\gamma {\sum }_{a}{\theta }_{a}\,{f}_{a}({{\boldsymbol{x}}})\right]}_{+}^{1/\gamma }.\)  (3)
Above, the square bracket operator sets negative values to zero, \({\left[x\right]}_{+}=\max \{0,x\}\). We refer to distributions following (2) as the deformed exponential family, which maximises both Rényi and Tsallis entropies44,45. When γ → 0, Rényi’s entropy tends to Shannon’s and (2) to the standard exponential family42.
A fundamental insight explored in this study is that higher-order interdependencies can be efficiently captured by deformed exponential family distributions46,47. Starting from a standard Shannon’s MEP model with low-order interactions, it can be shown that varying γ in (2) results in a deformation of the statistical manifold which, in turn, enhances the capability of pγ(x) to account for higher-order interdependencies. In effect, the consequence of deformation can be investigated by rewriting (2) via Taylor expansion of the exponent,

\({p}_{\gamma }({{\boldsymbol{x}}})={e}^{-{\varphi }_{\gamma }}\exp \left({\sum }_{a}{\theta }_{a}{f}_{a}({{\boldsymbol{x}}})-\frac{\gamma }{2}{\left({\sum }_{a}{\theta }_{a}{f}_{a}({{\boldsymbol{x}}})\right)}^{2}+\frac{{\gamma }^{2}}{3}{\left({\sum }_{a}{\theta }_{a}{f}_{a}({{\boldsymbol{x}}})\right)}^{3}-\cdots \right),\)  (4)
which is valid for the case 1 + γ∑aθafa(x) > 0, and otherwise pγ(x) = 0. This shows that the deformed manifold contains interactions of all orders even if fa(x) is restricted to lower orders, while establishing a specific dependency structure across the orders, thereby avoiding a combinatorial explosion of the number of required parameters. The deformation resulting from the maximisation of a non-Shannon entropy has been shown to reflect a curvature of the space of possible models in information geometry42,45,48,49. This leads to a particular foliation of the space of possible models50 (an ‘onion-like’ manifold structure, Fig. 1), which has properties that allow one to re-derive the MEP from fundamental geometric properties—for technical details, see Supplementary Note 1.
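The deformed family (2) can be probed directly on small systems by brute-force enumeration. The following sketch (our illustrative code, not the authors' implementation) builds pγ(x) for a few spins with pairwise features and checks that γ → 0 recovers the standard exponential family.

```python
import numpy as np
from itertools import product

def deformed_family(theta, gamma):
    """p_gamma(x) ∝ [1 + gamma * sum_a theta_a f_a(x)]_+^(1/gamma), eq. (2),
    with pairwise features f_ij(x) = x_i x_j on n binary spins."""
    n = theta.shape[0]
    states = np.array(list(product([-1, 1], repeat=n)))
    u = np.einsum('si,ij,sj->s', states, np.triu(theta, 1), states)
    if gamma == 0.0:
        w = np.exp(u)                      # flat (exponential family) limit
    else:
        base = 1.0 + gamma * u
        w = np.zeros_like(base)
        w[base > 0] = base[base > 0] ** (1.0 / gamma)   # [.]_+ support rule
    return w / w.sum()                     # normalisation e^{-phi_gamma}

rng = np.random.default_rng(0)
theta = 0.3 * rng.standard_normal((4, 4))
p_flat = deformed_family(theta, 0.0)
p_near = deformed_family(theta, 1e-6)
print(np.abs(p_flat - p_near).max())       # ~0: gamma -> 0 recovers the flat model
print(deformed_family(theta, -0.2).sum())  # 1.0: curved model stays normalised
```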
Fig. 1: Illustration of a family of standard MEP models (right) and its deformed counterpart (bottom left). The space of MEP distributions with constraints of different orders constitute nested sub-manifolds29, giving rise to a hierarchy of sub-families of models of the form \({{{\mathcal{E}}}}_{k}^{\gamma }=\{{p}_{\gamma }^{(k)}({{\boldsymbol{x}}})={e}^{-{\varphi }_{\gamma }}{\left[1-\gamma \beta {E}_{k}({{\boldsymbol{x}}})\right]}_{+}^{1/\gamma }\}\) such that \({{{\mathcal{E}}}}_{1}^{\gamma }\subset {{{\mathcal{E}}}}_{2}^{\gamma }\subset \cdots \subset {{{\mathcal{E}}}}_{n}^{\gamma }\)42. The foliation depends on the curvature γ, and in general \({{{\mathcal{E}}}}_{k}^{\gamma }\ne {{{\mathcal{E}}}}_{k}^{0}\) but rather \({{{\mathcal{E}}}}_{k}^{\gamma }\cap {{{\mathcal{E}}}}_{r}^{0}\ne\) ∅ for k < r. For small values of ∣γ∣, it is possible to neglect higher-order terms in (4), and therefore certain subsets of \({{{\mathcal{E}}}}_{k}^{\gamma }\) effectively approximate \({{{\mathcal{E}}}}_{r}^{0}\).
Curved neural networks
Several well-known neural network models adhere to the MEP, such as Ising-like models51 and Boltzmann machines52. Interestingly, these models can encode patterns in their weights in the form of ‘associative memories’ as in Nakano-Amari-Hopfield networks53,54,55, being amenable to investigation using tools from the equilibrium and nonequilibrium statistical physics literature56,57,58,59. Following the principles laid down in the previous section, we now introduce a family of recurrent neural networks that we call curved neural networks.
For this purpose, let us consider N binary variables x1, …, xN taking values in {1, −1} following a joint probability distribution

\({p}_{\gamma }({{\boldsymbol{x}}})={e}^{-{\varphi }_{\gamma }}{\left[1-\gamma \beta E({{\boldsymbol{x}}})\right]}_{+}^{1/\gamma },\)  (5)
where φγ is a normalising constant. Above, we call E(x) the (stochastic) energy function (i.e., Hamiltonian) and β the inverse temperature, due to their similarity with the Gibbs distribution in statistical physics when γ → 0. Note that, unlike exponential families, these models do not exhibit energy invariance under constant shifts. However, as demonstrated in Ref. 41, deformed exponential models can be related to energy-invariant models by rescaling their temperature, which can be seen as maximising entropy with respect to escort statistics rather than the original natural statistics.
Neural network models are typically defined by considering pγ(x) as defined in (5) with an energy function of the form

\(E({{\boldsymbol{x}}})=-\frac{1}{N}{\sum }_{i < j}{J}_{ij}{x}_{i}{x}_{j}-{\sum }_{i}{H}_{i}{x}_{i},\)  (6)
where Jij is the coupling strength between neurons xi and xj, and Hi are bias terms. In the limit γ → 0, p0(x) recovers the Ising model. Emulating classical associative memories, the weights Jij can be made to encode a collection of M neural patterns \({{{\boldsymbol{\xi }}}}^{a}=\{{\xi }_{1}^{a},\ldots,{\xi }_{N}^{a}\}\), \({\xi }_{i}^{a}=\pm \! 1\) and a = 1, …, M by using the well-known Hebbian rule55,56

\({J}_{ij}=J{\sum }_{a=1}^{M}{\xi }_{i}^{a}{\xi }_{j}^{a},\)  (7)

where J is a scaling parameter.
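As a minimal sketch of this setup (our illustrative code; the array shapes and the 1/N energy scaling follow the definitions of (6) and (7) above):

```python
import numpy as np

def hebbian_couplings(patterns, J=1.0):
    """Hebbian rule (7): J_ij = J * sum_a xi_i^a xi_j^a, with zero self-coupling.
    patterns: (M, N) array with +/-1 entries."""
    Jmat = J * (patterns.T @ patterns).astype(float)
    np.fill_diagonal(Jmat, 0.0)
    return Jmat

def energy(x, Jmat, H=None):
    """Energy (6): E(x) = -(1/N) sum_{i<j} J_ij x_i x_j - sum_i H_i x_i."""
    N = len(x)
    E = -0.5 * (x @ Jmat @ x) / N   # the 1/2 converts the full double sum to i < j
    return E if H is None else E - H @ x

rng = np.random.default_rng(1)
patterns = rng.choice([-1, 1], size=(3, 200))   # M = 3 patterns, N = 200 neurons
Jmat = hebbian_couplings(patterns)
print(energy(patterns[0], Jmat))                # stored patterns sit at low energy
```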
Before proceeding with our main analysis, one can gain insights into the effect of the curvature γ from the dynamics of a recurrent neural network that behaves as a sampler of the equilibrium distribution described by (5). For this, we adapt the classic Glauber dynamics to curved neural networks (see Supplementary Note 2) to obtain

\(p(-{x}_{i}| {{{\boldsymbol{x}}}}_{\setminus i})={\left(1+{\left[1-\gamma {\beta }^{{\prime} }({{\boldsymbol{x}}})\Delta E({{\boldsymbol{x}}})\right]}_{+}^{-1/\gamma }\right)}^{-1},\)  (8)

where x\i denotes the state of all neurons except xi, \(\Delta E({{\boldsymbol{x}}})=2{x}_{i}({H}_{i}+\frac{1}{N}{\sum }_{j}\,{J}_{ij}{x}_{j})\) is the energy difference associated with detailed balance, and \({\beta }^{{\prime} }({{\boldsymbol{x}}})\) is an effective inverse temperature given by

\({\beta }^{{\prime} }({{\boldsymbol{x}}})=\frac{\beta }{1-\gamma \beta E({{\boldsymbol{x}}})}.\)  (9)
Again, γ → 0 recovers the classic Glauber dynamics and \({\beta }^{{\prime} }({{\boldsymbol{x}}})=\beta\). Thus, the curvature affects the dynamics through the deformed nonlinear activation function (8) and the state-dependent effective temperature \({\beta }^{{\prime} }({{\boldsymbol{x}}})\) (9), with higher \({\beta }^{{\prime} }({{\boldsymbol{x}}})\) inducing lower degrees of randomness in the transitions. The effect of E(x) on \({\beta }^{{\prime} }({{\boldsymbol{x}}})\) then depends on the sign of γ. A negative γ increases \({\beta }^{{\prime} }({{\boldsymbol{x}}})\) as the energy decreases during relaxation, reducing the stochasticity of the dynamics and accelerating convergence to a low-energy state; the lower energy, in turn, raises \({\beta }^{{\prime} }\) further, creating a positive feedback loop between energy and effective temperature. The effect is similar to simulated annealing, but the coupling of the energy and the effective inverse temperature lets the annealing schedule self-regulate to accelerate convergence. In contrast, positive γ decelerates the dynamics through negative feedback. Such accelerating or decelerating dynamics underlie the non-trivial collective behaviours of curved neural networks examined in the subsequent sections.
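A single asynchronous update can be sketched as follows, reusing energy() from the previous snippet (our illustration, assuming the flip probability of (8) with the effective temperature (9); not the authors' implementation):

```python
def glauber_step(x, Jmat, beta, gamma, H=None, rng=None):
    """One deformed Glauber update: flip x_i with probability
    1 / (1 + [1 - gamma * beta_eff * dE]_+^(-1/gamma)), which reduces to the
    classic rule 1 / (1 + exp(beta * dE)) as gamma -> 0."""
    rng = rng or np.random.default_rng()
    N = len(x)
    i = rng.integers(N)
    h = Jmat[i] @ x / N + (0.0 if H is None else H[i])
    dE = 2.0 * x[i] * h                         # energy difference of flipping x_i
    if gamma == 0.0:
        p_flip = 1.0 / (1.0 + np.exp(beta * dE))
    else:
        beta_eff = beta / (1.0 - gamma * beta * energy(x, Jmat, H))   # eq. (9)
        base = 1.0 - gamma * beta_eff * dE
        if base <= 0.0:
            p_flip = 1.0 if gamma < 0 else 0.0  # [.]_+ boundary cases
        else:
            p_flip = 1.0 / (1.0 + base ** (-1.0 / gamma))
    if rng.random() < p_flip:
        x[i] = -x[i]
    return x
```

With γ < 0, lowering the energy raises beta_eff, which in turn suppresses flips against the local field, implementing the self-regulated annealing described above.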
Mean-field behaviour of curved associative-memory networks
As with regular associative memories58, one can solve the behaviour of curved associative-memory networks through mean-field methods in the thermodynamic limit N → ∞ (Supplementary Note 3). Here the energy is extensive, meaning that it scales with the system’s size N. To ensure the deformation parameter remains independent of system properties such as size or temperature, we scale it as follows:

\(\gamma =\frac{{\gamma }^{{\prime} }}{\beta N}.\)  (10)
Under this condition, we calculate the normalising potential φγ by introducing a delta integral and calculating a saddle-point solution, resulting in a set of order parameters m = {m1, …, mM}, \({m}_{a}=\frac{1}{N}{\sum }_{i}{\xi }_{i}^{a}\langle {x}_{i}\rangle\) in the limit of size N → ∞. This calculation assumes 1 − γβE(x) > 0 so that \({[]}_{+}\) operators can be omitted and φγ is differentiable. The solution results in (for Hi = 0):

\({\varphi }_{\gamma }=N\,\mathop{{{\rm{extr}}}}_{{{\boldsymbol{m}}}}\left\{\frac{\beta }{{\gamma }^{{\prime} }}\ln \left(1+\frac{{\gamma }^{{\prime} }J}{2}{\sum }_{a}{m}_{a}^{2}\right)-{\beta }^{{\prime} }J{\sum }_{a}{m}_{a}^{2}+{\left\langle \ln 2\cosh \left({\beta }^{{\prime} }J{\sum }_{a}{\xi }^{a}{m}_{a}\right)\right\rangle }_{{{\boldsymbol{\xi }}}}\right\},\)  (11)
where \({\beta }^{{\prime} }\) is given by

\({\beta }^{{\prime} }=\frac{\beta }{1+\frac{{\gamma }^{{\prime} }J}{2}{\sum }_{a}{m}_{a}^{2}},\)  (12)
and the values of the mean-field variables ma are found from the following self-consistent equations:

\({m}_{a}={\left\langle {\xi }^{a}\tanh \left({\beta }^{{\prime} }J{\sum }_{b}{\xi }^{b}{m}_{b}\right)\right\rangle }_{{{\boldsymbol{\xi }}}}.\)  (13)
Similarly, using a generating functional approach59, we use the Glauber rule in (8) to derive a dynamical mean-field description via path integral methods (see Supplementary Note 4). This yields

\(\frac{d{m}_{a}}{dt}=-{m}_{a}+{\left\langle {\xi }^{a}\tanh \left({\beta }^{{\prime} }J{\sum }_{b}{\xi }^{b}{m}_{b}\right)\right\rangle }_{{{\boldsymbol{\xi }}}},\)  (14)
where \({\beta }^{{\prime} }\) is defined as in (12) for each m. Note that in large systems, we recover the classical nonlinear activation function, and the deformation affects the dynamics only through the effective temperature \({\beta }^{{\prime} }\).
Explosive phase transitions
To illustrate these findings, let us focus on a neural network with a single associative pattern (M = 1), which is similar to the Mattis model60 and equivalent to a homogeneous mean-field Ising model61 (with energy \(E({{\boldsymbol{x}}})=-\frac{1}{N}J{\sum }_{i < j}{x}_{i}{x}_{j}\)) via the change of variables xi ← ξixi. Rewriting (13), we find that a one-pattern curved neural network follows a mean-field model given by

\(m=\tanh \left({\beta }^{{\prime} }Jm\right),\)  (15)

\({\beta }^{{\prime} }=\frac{\beta }{1+\frac{1}{2}{\gamma }^{{\prime} }J{m}^{2}}.\)  (16)

This result generalises the well-known Ising mean-field solution \(m=\tanh \left(\beta Jm\right)\), which is recovered for γ = 0.
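The explosive transition can be traced numerically by iterating (15)-(16) while sweeping β in both directions and seeding each solve with the previous solution. The sketch below (damping, grid, and seed values are our choices) exposes the hysteresis loop at \({\gamma }^{{\prime} }=-1.5\).

```python
import numpy as np

def mf_solve(beta, gamma_p, J=1.0, m0=0.5, iters=5000, damping=0.5):
    """Damped fixed-point iteration of (15)-(16):
    m = tanh(beta' J m),  beta' = beta / (1 + 0.5 * gamma_p * J * m^2)."""
    m = m0
    for _ in range(iters):
        beta_eff = beta / (1.0 + 0.5 * gamma_p * J * m * m)
        m = (1.0 - damping) * m + damping * np.tanh(beta_eff * J * m)
    return m

betas = np.linspace(0.6, 1.4, 81)
up, down, m = [], [], 0.01
for b in betas:                       # forward sweep from the disordered side
    m = mf_solve(b, -1.5, m0=max(abs(m), 0.01))
    up.append(m)
for b in betas[::-1]:                 # backward sweep from the ordered side
    m = mf_solve(b, -1.5, m0=max(abs(m), 0.01))
    down.append(m)
i = 30                                # betas[30] = 0.9, inside the hysteresis loop
print(up[i], down[len(betas) - 1 - i])   # disordered and ordered branches coexist
```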
By evaluating these equations, one finds that the model exhibits the usual order-disorder phase transition for positive and small negative values of \({\gamma }^{{\prime} }\) (Fig. 2a top). However, for large negative values of \({\gamma }^{{\prime} }\), a different behaviour emerges: an explosive phase transition8 that displays hysteresis due to HOIs (Fig. 2a bottom). The resulting phase diagram (Fig. 2b) closely resembles phase transitions in higher-order contagion models9,11 and higher-order synchronisation observed in Kuramoto models30.
Fig. 2: a Phase transitions of the curved neural network with one associative memory, for J = 1 and values of \({\gamma }^{{\prime} }=-\!0.5\) (top, displaying a second-order phase transition) and \({\gamma }^{{\prime} }=-\!1.5\) (bottom, displaying an explosive phase transition). Solid lines represent the stable fixed points, and dotted lines correspond to unstable fixed points. b Phase diagram of the system. The areas indicated by P and M refer to the usual paramagnetic (disordered) and magnetic (ordered) phases, respectively. The area indicated by Exp represents a phase where ordered and disordered states coexist in an explosive phase transition characterised by a hysteresis loop. c Solutions of (15)-(16) for \({\beta }^{{\prime} },m,\beta\) (black line) for \({\gamma }^{{\prime} }=-\!1.2\), and projections onto the planes m = 0, β = 0 and \({\beta }^{{\prime} }=0\), obtaining, respectively, the relation between \(\beta\) and \({\beta }^{{\prime} }\), the solutions of the flat model, and the solutions of the deformed model (grey lines). d Mean-field dynamics of the single-pattern neural network for β = 1.001 (near criticality from the ordered phase) for some values of \({\gamma }^{{\prime} }\) in [ −1.5, 0]. For large negative \({\gamma }^{{\prime} }\) the dynamics ‘explodes’, with m (top) and \({\beta }^{{\prime} }\) (bottom) converging abruptly.
One can intuitively interpret the effect of the deformation parameter \({\gamma }^{{\prime} }\) by noticing that, for a fixed \({\beta }^{{\prime} }\), m solves the same self-consistent equation (15) as in the undeformed model. For \({\gamma }^{{\prime} }=0\), this results in the mean-field behaviour of the regular exponential model, which assigns a value of m to each inverse temperature \(\beta={\beta }^{{\prime} }\). In the case of the deformed model, the possible pairs of solutions \((m,{\beta }^{{\prime} })\) are the same, but their mapping to the inverse temperatures β changes. Namely, this deformation can be interpreted as a stretching (or contraction) of the effective temperature, which maps each pair \((m,{\beta }^{{\prime} })\) to an inverse temperature \(\beta={\beta }^{{\prime} }(1+\frac{1}{2}{\gamma }^{{\prime} }J{m}^{2})\) according to (16). Thus, one can obtain the mean-field solutions of the deformed model as mappings of the solutions of the original model. This is illustrated in Fig. 2c, where the solution of \({\beta }^{{\prime} },m,\beta\) is projected onto the planes β = 0 and \({\beta }^{{\prime} }=0\), obtaining the solutions for the flat (\({\gamma }^{{\prime} }=0\)) and the deformed (\({\gamma }^{{\prime} }=-1.2\)) models respectively.
In order to gain a deeper understanding of the explosive nature of this phase transition, we study the dynamics of the single-pattern neural network. By rewriting (14) for M = 1, and under the change of variables mentioned above to remove ξ, the dynamical mean-field equation of the system reduces to

\(\frac{dm}{dt}=-m+\tanh \left({\beta }^{{\prime} }Jm\right),\)  (17)
where \({\beta }^{{\prime} }\) is calculated as in (16). Simulations of the dynamical mean-field equations for values of β just above the critical point are depicted in Fig. 2d. Trajectories with strongly negative \({\gamma }^{{\prime} }\) saturate earlier than those with weaker negative \({\gamma }^{{\prime} }\), confirming accelerated convergence. During this process, the effective inverse temperature \({\beta }^{{\prime} }\) rapidly increases until it saturates, creating a positive feedback loop between \({\beta }^{{\prime} }\) and m that gives rise to the explosive nature of the phase transition. This positive loop occurs only if \({\gamma }^{{\prime} }\) is negative; otherwise, negative feedback simply makes the convergence of m slower.
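The accelerated convergence is easy to reproduce by integrating (17) directly; the sketch below (Euler discretisation and horizon are our choices) mirrors the setting of Fig. 2d.

```python
def mf_dynamics(beta, gamma_p, J=1.0, m0=0.01, dt=0.05, steps=60000):
    """Euler integration of (17): dm/dt = -m + tanh(beta' J m), beta' from (16)."""
    m, traj = m0, []
    for _ in range(steps):
        beta_eff = beta / (1.0 + 0.5 * gamma_p * J * m * m)
        m += dt * (-m + np.tanh(beta_eff * J * m))
        traj.append(m)
    return np.array(traj)

# Near criticality (beta = 1.001): the more negative gamma', the earlier and the
# more abruptly m jumps towards the ordered branch.
for g in (0.0, -0.5, -1.0, -1.5):
    print(g, round(mf_dynamics(1.001, g)[-1], 3))
```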
Overlaps between memory basins of attraction
A key property of associative-memory networks is their ability to retrieve patterns in different contexts. In the case of one-pattern associative-memory networks, the energy function \(E({{\boldsymbol{x}}})=-\frac{J}{N}{\sum }_{i < j}{x}_{i}{\xi }_{i}{\xi }_{j}{x}_{j}\) is a quadratic function with two minima at x = ± ξ, which constitute global attractors. In contrast, a two-pattern associative-memory network has an energy function with four minima (if sufficiently separated), but their attraction basins can overlap when the patterns are correlated.
To study the degree of the overlap between pairs of patterns, we analyse solutions of (13) for a network with two patterns with correlation \(\langle {\xi }_{i}^{1}{\xi }_{i}^{2}\rangle=C\) (see Supplementary Note 3.3 for details). In this scenario, the system is described by two mean-field patterns:

\({m}_{a}=\frac{1+C}{2}\tanh \left({\beta }^{{\prime} }J({m}_{1}+{m}_{2})\right)+w\,\frac{1-C}{2}\tanh \left({\beta }^{{\prime} }J({m}_{1}-{m}_{2})\right),\)  (18)

with w = 3 − 2a = ± 1 for a = 1, 2, and

\({\beta }^{{\prime} }=\frac{\beta }{1+\frac{{\gamma }^{{\prime} }J}{2}({m}_{1}^{2}+{m}_{2}^{2})}.\)  (19)
Figure 3 shows how the hysteresis effect and explosive phase transitions persist in the case of two patterns for C = 0.2 with negative \({\gamma }^{{\prime} }\). This example shows two consecutive, overlapping explosive bifurcations (going from 1 to 2, and then to 4 fixed points), creating a hysteresis involving 7 fixed points within a more compressed range of β than in the classical case. Consequently, the memory-retrieval region for the four embedded memories expands. These results illustrate complex hysteresis cycles as well as an increased memory capacity at finite temperatures for negative values of \({\gamma }^{{\prime} }\). This enhanced capability for memory retrieval is further investigated through the replica analyses in the next section.
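Coexisting attractors in this regime can be located by iterating the two-pattern equations from different seeds; a minimal sketch (assuming the form of (18)-(19) given above, with damping and seed values chosen for illustration):

```python
def two_pattern_solve(beta, gamma_p, C=0.2, J=1.0, m_init=(0.9, 0.1),
                      iters=5000, damping=0.5):
    """Damped iteration of the two-pattern mean-field equations (18)-(19)."""
    m1, m2 = m_init
    for _ in range(iters):
        beta_eff = beta / (1.0 + 0.5 * gamma_p * J * (m1 * m1 + m2 * m2))
        s = np.tanh(beta_eff * J * (m1 + m2))
        d = np.tanh(beta_eff * J * (m1 - m2))
        n1 = 0.5 * (1 + C) * s + 0.5 * (1 - C) * d   # w = +1 (a = 1)
        n2 = 0.5 * (1 + C) * s - 0.5 * (1 - C) * d   # w = -1 (a = 2)
        m1 = (1.0 - damping) * m1 + damping * n1
        m2 = (1.0 - damping) * m2 + damping * n2
    return m1, m2

# Different seeds settle into different fixed points when attractors coexist:
for seed in [(0.9, 0.1), (0.1, 0.9), (0.4, 0.4), (0.02, 0.02)]:
    print(seed, two_pattern_solve(beta=1.05, gamma_p=-1.2, m_init=seed))
```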
Fig. 3: a Values of φγ for different mean-field values m1, m2, indicating the attractor structure of the network for different values of β with J = 1, C = 0.2 for \({\gamma }^{{\prime} }=0\) (top row) and \({\gamma }^{{\prime} }=-1.2\) (bottom row). b Bifurcations of the order parameters m1, m2. For \({\gamma }^{{\prime} }=0\) we observe an attractor bifurcating into two and then into four. For \({\gamma }^{{\prime} }=-1.2\), we observe the same sequence, but with a coexistence hysteresis regime in which 7 attractors are possible.
Memory retrieval with an extensive number of patterns
Next, we investigate how the deformation related to γ impacts the memory-storage capacity of associative memories. In classical associative networks of N neurons, the energy function is defined as \(E({{\boldsymbol{x}}})=-\frac{J}{N}{\sum }_{a=1}^{M}{\sum }_{i < j}{x}_{i}{\xi }_{i}^{a}{\xi }_{j}^{a}{x}_{j}\) with M = αN. As the number of patterns learned by the network increases, the system transitions to a disordered spin-glass state in the thermodynamic limit. Furthermore, this model can be solved analytically62,63,64,65. For example, the replica trick can be used to determine the memory capacity of the system62 and to theoretically identify the critical value of α at which memory retrieval becomes impossible—leading to a disordered spin-glass phase. Here, we apply a similar approach to reveal how deformed associative memory networks afford an enhanced memory capacity.
Applying the replica trick in conjunction with the methods outlined in previous sections allows us to solve the system (see Supplementary Note 5). This method entails computing a mean-field variable m corresponding to one of the patterns ξ a and averaging over the others. For simplicity, a pattern with all positive unity values ξ a = (1, 1, …, 1) is considered, which is equivalent to any other single pattern up to a series of sign-flip variable changes. The degree of similarity or overlap of this pattern with other patterns in the system introduces a new order parameter q, which contributes to measuring disorder in the system. After introducing the relevant order parameters and solving under a replica-symmetry assumption, the normalising potential φγ is derived (see Supplementary Note 5), where J is a scaling factor, and the order parameters are defined as

\(m=\int Dz\,\tanh \left({\beta }^{{\prime} }J\left(m+\sqrt{\alpha r}\,z\right)\right),\)  (21)

\(q=\int Dz\,{\tanh }^{2}\left({\beta }^{{\prime} }J\left(m+\sqrt{\alpha r}\,z\right)\right),\)  (22)
with

\(r=\frac{q}{{\left(1-{\beta }^{{\prime} }J(1-q)\right)}^{2}},\)  (23)

where \(Dz\) denotes the standard Gaussian measure.
As in previous cases, the model is governed by an effective temperature \({\beta }^{{\prime} }\), whose expression is derived in Supplementary Note 5. This solution differs from the models in previous sections by the self-dependence of \({\beta }^{{\prime} }\).
To obtain a phase diagram, we solved (21)-(22) numerically for given \(\alpha,{\beta }^{{\prime} }\) at \({\gamma }^{{\prime} }=0\), and rescaled the inverse temperature as in the previous section to obtain the corresponding values of β for each \({\gamma }^{{\prime} }\). Using the resulting order parameters and calculating the free energy for each \(\alpha,\beta,{\gamma }^{{\prime} }\), we constructed the phase diagram of the system (similarly to regular associative memories58,62) characterised by the following distinct phases (Fig. 4):
-
A paramagnetic phase (P), corresponding to disordered solutions with m = q = 0, where memory-retrieval fails due to the dominance of fluctuations.
-
A ferromagnetic phase (F), corresponding to stable memory-retrieval solutions with m > 0 and q > 0.
-
A spin-glass phase (SG), exhibiting spurious-retrieval solutions with m = 0 and q > 0.
-
A mixed phase (M), where F and SG types of solutions coexist, with the spin-glass solutions being the global minimum of the normalising potential φγ.
Fig. 4: Phase diagram of a curved associative memory with an extensive number of encoded patterns M = αN and J = 1 for (a) different T = 1/β at \({\gamma }^{{\prime} }=0\) (black dashed lines), 0.8, − 0.8 (solid lines), and for (b) different \({\gamma }^{{\prime} }\) at β = 2. F indicates the ferromagnetic (i.e., memory retrieval) phase, SG the spin-glass phase (where saturation makes memory retrieval infeasible), M a mixed phase, and P the paramagnetic region. In both F and M, ferromagnetic and spin-glass solutions coexist, but we differentiate these phases by whether the memory-retrieval or the spin-glass solutions are the global minimum of the normalising potential φγ. The dotted lines in (a) near T = 0 indicate the AT lines, below which the replica-symmetric solution is not valid. Larger negative values of \({\gamma }^{{\prime} }\) extend the retrieval phase to larger values of α, indicating an increased memory capacity, while larger positive values reduce the extent of the mixed phase, increasing the robustness of memory retrieval.
For \({\gamma }^{{\prime} }=0\) (black dashed lines), the phase transition reflects the behaviour of associative memories near saturation58,62. With negative \({\gamma }^{{\prime} }\) (red lines), we observe an expansion of the ferromagnetic and mixed phases, indicating that the deformation enhances the memory-storage capacity. Conversely, a positive value of \({\gamma }^{{\prime} }\) (yellow lines) decreases the memory capacity but reduces the extent of the mixed phase. In the mixed phase, retrieved memories (m > 0) correspond to a local—but not global—minimum of the normalising potential φγ, indicating a larger probability of observing spurious patterns. Thus, we expect positive values of \({\gamma }^{{\prime} }\) to result in more robust memory retrieval.
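As a sketch of the starting point of this procedure, the code below iterates the classical (\({\gamma }^{{\prime} }=0\)) replica-symmetric equations (21)-(23) using Gauss-Hermite quadrature for the Gaussian averages (the quadrature size, damping, and the illustrative α and β values are our choices; the paper's phase-diagram construction then rescales β′ to β for each \({\gamma }^{{\prime} }\)):

```python
import numpy as np

def rs_fixed_point(alpha, beta, m0=0.9, iters=3000, damping=0.3, nquad=81):
    """Damped iteration of the replica-symmetric equations at gamma' = 0, J = 1:
    m = <tanh(beta (m + sqrt(alpha r) z))>_z,  q = <tanh^2(...)>_z,
    r = q / (1 - beta (1 - q))^2,  with z ~ N(0, 1)."""
    z, w = np.polynomial.hermite_e.hermegauss(nquad)
    w = w / w.sum()                       # weights for a standard-normal average
    m, q = m0, m0 ** 2
    for _ in range(iters):
        denom = (1.0 - beta * (1.0 - q)) ** 2
        r = q / max(denom, 1e-12)         # guard against the transient singularity
        t = np.tanh(beta * (m + np.sqrt(alpha * r) * z))
        m = (1.0 - damping) * m + damping * (w @ t)
        q = (1.0 - damping) * q + damping * (w @ (t * t))
    return m, q

print(rs_fixed_point(0.10, beta=10.0))   # below saturation: m stays high (retrieval)
print(rs_fixed_point(0.20, beta=10.0))   # above alpha_c ~ 0.138: m collapses (SG)
```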
The stability of the replica-symmetric solution is given by the condition

\({\left(1-{\beta }^{{\prime} }J(1-q)\right)}^{2} \, > \, \alpha {\left({\beta }^{{\prime} }J\right)}^{2}\int Dz\,{\cosh }^{-4}\left({\beta }^{{\prime} }J\left(m+\sqrt{\alpha r}\,z\right)\right),\)  (24)

which is captured by the dotted lines near zero temperature in Fig. 4a. Note that all solutions in Fig. 4b are stable under the replica-symmetry assumption.
We complement the preceding analysis with an experimental study of a system encoding patterns from an image classification benchmark. The patterns are sourced from the CIFAR-100 dataset, which comprises 60,000 32 × 32 colour images66. To adapt the dataset to binary patterns suitable for storage in an associative memory, we processed each RGB channel by assigning a value of 1 to pixels with values greater than the channel’s median value and −1 otherwise (Fig. 5a). The resulting array of N = 32 ⋅ 32 ⋅ 3 binary values for each image was assigned to patterns ξ a. Note that associative memories (as well as our theory above) usually assume that patterns are relatively uncorrelated, and specific methods are required to adapt them to correlated patterns67,68. To simplify the problem, we conducted experiments using a selection of 100 images with covariance values smaller than \(10/\sqrt{N}\) (the standard deviation of the covariance values for uncorrelated patterns is \(1/\sqrt{N}\)). We used a random search to select patterns with low correlations: we randomly picked an image and replaced it if its correlation exceeded the threshold, repeating until all correlations were below it.
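A minimal sketch of this preprocessing (assuming the median is taken per image and per channel, and using a greedy variant of the random search described above; loading CIFAR-100 itself is omitted):

```python
import numpy as np

def binarize(images):
    """Per-image, per-channel median binarisation: pixel > median -> +1, else -1.
    images: uint8 array of shape (n, 32, 32, 3); returns (n, 3072) +/-1 patterns."""
    med = np.median(images, axis=(1, 2), keepdims=True)
    return np.where(images > med, 1, -1).reshape(len(images), -1)

def select_low_correlation(patterns, n_select=100, thr=None, rng=None):
    """Greedily keep patterns whose absolute correlation with all previously
    selected ones stays below thr (default 10/sqrt(N), as in the text)."""
    rng = rng or np.random.default_rng(0)
    N = patterns.shape[1]
    thr = 10 / np.sqrt(N) if thr is None else thr
    chosen = []
    for idx in rng.permutation(len(patterns)):
        if all(abs(patterns[idx] @ patterns[j]) / N < thr for j in chosen):
            chosen.append(idx)
        if len(chosen) == n_select:
            break
    return patterns[chosen]

# Demo on random stand-in data (replace with the binarised CIFAR-100 images):
fake = np.random.default_rng(0).integers(0, 256, size=(500, 32, 32, 3))
print(select_low_correlation(binarize(fake)).shape)   # -> (100, 3072)
```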
Fig. 5: a Examples of CIFAR-100 images (top) and their RGB binarised versions (bottom). Each of the 32 × 32 × 3 binary RGB pixel values of image a is assigned to one position of pattern \({\xi }_{i}^{a}\). b, c Mean and variance of pattern retrieval values obtained in experiments, measured by the overlap between the final state of the network and the encoded pattern.
We evaluated the memory-retrieval capacity of networks with various degrees of curvature γ by encoding different numbers of memories, as described in (7). As a measure of performance, we evaluated the stability of the network by assigning an initial state x = ξ a and calculating the overlap \(o={\sum }_{i}{x}_{i}{\xi }_{i}^{a}\) after T = 30N Glauber updates for β = 2, J = 1. The process was repeated R = 500 times from different initial conditions (different encoded patterns and different initial states) to estimate the value of m in (21). Experimental outcomes confirm our theoretical results: memory capacity increases with negative values of \({\gamma }^{{\prime} }\), while positive values reduce the memory capacity (Fig. 5b) but also reduce the extent and magnitude of the high-variability region in pattern retrieval (Fig. 5c), consistent with the reduction of the mixed phase. Note that the memory capacity observed in our experiments (i.e., the value of α at which the transition happens) is diminished due to the presence of correlations among some of the memorised patterns.
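The stability test can be sketched by combining the Hebbian and Glauber routines above with the scaling (10) (our illustrative code; the network size and number of repetitions are reduced for speed):

```python
def retrieval_overlap(patterns, beta=2.0, gamma_p=0.0, J=1.0, sweeps=30, rng=None):
    """Initialise at a stored pattern, run T = sweeps * N Glauber updates,
    and return the normalised overlap o / N with that pattern."""
    rng = rng or np.random.default_rng()
    M, N = patterns.shape
    Jmat = hebbian_couplings(patterns, J)
    gamma = gamma_p / (beta * N)          # scaling (10)
    a = rng.integers(M)
    x = patterns[a].copy()
    for _ in range(sweeps * N):
        glauber_step(x, Jmat, beta, gamma, rng=rng)
    return x @ patterns[a] / N

rng = np.random.default_rng(2)
pats = rng.choice([-1, 1], size=(10, 200))          # alpha = M/N = 0.05
print(retrieval_overlap(pats, gamma_p=-0.8, rng=rng))
```

Averaging such overlaps over many repetitions and pattern loads α = M/N traces retrieval curves of the kind shown in Fig. 5b, c.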
Finally, we investigated transitions near the spin-glass phase boundaries. First, we note that, for J → 0 and α = J−2, the model in (21)-(22) converges to (see Supplementary Note 5)

\(q=\int Dz\,{\tanh }^{2}\left({\beta }^{{\prime} }\sqrt{q}\,z\right),\qquad {\beta }^{{\prime} }=\frac{\beta }{1+\frac{{\gamma }^{{\prime} }{\beta }^{{\prime} }}{2}\left(1-{q}^{2}\right)},\)  (25)

which at γ = 0 recovers the well-known Sherrington-Kirkpatrick model69 (see Supplementary Note 6). While in the classical case a phase transition occurs from a paramagnetic to a spin-glass phase, the curvature effect of \({\gamma }^{{\prime} }\ne 0\) modifies the nature of this transition. For small values of \({\gamma }^{{\prime} }\), the system exhibits a continuous phase transition akin to the Sherrington–Kirkpatrick spin-glass, where \(\frac{dq}{d\beta }\) shows a cusp (Fig. 6a). However, for \({\gamma }^{{\prime} }=-\!1\) the phase transition becomes second-order, displaying a divergence of \(\frac{dq}{d\beta }\) at the critical point (Fig. 6b). Moreover, increasing the magnitude of negative \({\gamma }^{{\prime} }\) leads to a first-order phase transition with hysteresis (Fig. 6c), resembling the explosive phase transition observed in the single-pattern associative-memory network. This hybrid phase transition combines the typical critical divergence of a second-order phase transition with a genuine discontinuity, similar to ‘type V’ explosive phase transitions8.
Fig. 6: Phase transitions of the order parameter q for replica-symmetric disordered spin models displaying (a) a cusp phase transition for \({\gamma }^{{\prime} }=-\!0.5\), (b) a second-order phase transition for \({\gamma }^{{\prime} }=-\!1.0\) and (c) an explosive phase transition for \({\gamma }^{{\prime} }=-\!1.2\). (d) Phase diagram of the explosive spin glass, displaying a paramagnetic (P), spin-glass (SG) and an explosive phase (Exp).
We analytically calculated the properties of these phase transitions (see Supplementary Note 6). By computing the solution at \({\gamma }^{{\prime} }=0\) and rescaling \({\beta }^{{\prime} }\), we determined that the critical point is located at \({\beta }_{c}=1+\frac{1}{2}{\gamma }^{{\prime} }\) (consistent with Fig. 6a–c). The slope of the order parameter around the critical point is equal to \({(1+{\gamma }^{{\prime} })}^{-1}\), diverging as \({\gamma }^{{\prime} }\to -1\) and indicating the onset of a second-order phase transition as depicted in Fig. 6b. The resulting phase diagram of the curved Sherrington-Kirkpatrick model is shown in Fig. 6d.
Comparison with other dense associative memory models
Although our primary objective is to develop a parsimonious model of HOIs to explain higher-order phenomena, our framework can also be used to explain the behaviour of modern networks with HOIs, including the recently proposed relativistic Hopfield model32,33,34 and dense associative memories20,21. For this, let us consider the energy \({{\mathcal{F}}}[E]\) of the exponential family distribution \(p({{\boldsymbol{x}}}) \sim {e}^{-\beta {{\mathcal{F}}}[E]}\) given by the nonlinear transformation (denoted by \({{\mathcal{F}}}\)) of the classical energy E(x). The deformed exponential models in this study correspond to \({{\mathcal{F}}}[E]=-\frac{N}{{\gamma }^{{\prime} }}\ln (1-{\gamma }^{{\prime} }E/N)\), while the relativistic model corresponds to \({{\mathcal{F}}}[E]=-\frac{N}{{\gamma }^{{\prime} }}\sqrt{1-{\gamma }^{{\prime} }E/N}\). For the deformed exponential, the term \({{\mathcal{F}}}[E]\) can be expanded as

\({{\mathcal{F}}}[E]=E+\frac{{\gamma }^{{\prime} }}{2N}{E}^{2}+\frac{{{\gamma }^{{\prime} }}^{2}}{3{N}^{2}}{E}^{3}+\cdots .\)  (26)
When E depends on the quadratic Mattis magnetisation (i.e., \(E=-{\sum }_{a}\frac{1}{N}{\left({\sum }_{i}{\xi }_{i}^{a}{x}_{i}\right)}^{2}\)), then \({{\mathcal{F}}}[E]\) expands in terms of even-order HOIs of \({\sum }_{i}{\xi }_{i}^{a}{x}_{i}\). For \({\gamma }^{{\prime} } < 0\), all coefficients of \({\sum }_{i}{\xi }_{i}^{a}{x}_{i}\) in the expansion are negative, indicating that embedded memories have deeper energy minima than in the classical case. The same signs appear for each order in the relativistic energy with \({\gamma }^{{\prime} } < 0\). We also note that β in the free energy of both the deformed exponential and relativistic models in the limit of large N appears scaled according to an effective temperature given by \({\beta }^{{\prime} }=\beta {\partial }_{E}{{\mathcal{F}}}[E]\) (e.g., (11) and Eq. (6.2) in Ref. 34). Moreover, the input in the Glauber dynamics is approximated for large sizes as

\(\beta \,\Delta {{\mathcal{F}}}[E]\approx \beta \,{\partial }_{E}{{\mathcal{F}}}[E]\,\Delta E={\beta }^{{\prime} }\Delta E.\)  (27)
The effective inverse temperatures \({\beta }^{{\prime} }=\beta {(1-{\gamma }^{{\prime} }E/N)}^{-1}\) for the deformed exponential and \({\beta }^{{\prime} }={2}^{-1}{(1-{\gamma }^{{\prime} }E/N)}^{-1/2}\) for the relativistic models are decreasing functions of E when \({\gamma }^{{\prime} } < 0\), resulting in an acceleration of memory retrieval—with lower energy E resulting in higher \({\beta }^{{\prime} }\) (lower temperature). While the relativistic model has been studied for \({\gamma }^{{\prime} } > 0\)32,33,34, we conjecture it may exhibit explosive phase transitions if \({\gamma }^{{\prime} } < 0\). Conversely, a positive \({\gamma }^{{\prime} }\) introduces alternating signs in even-order terms of \({\sum }_{i}{\xi }_{i}^{a}{x}_{i}\), and a shallower energy landscape due to a reduction in \({\beta }^{{\prime} }\). This shallower energy landscape reduces the memory capacity of the deformed exponential networks by expanding the spin-glass phases (Fig. 4), but also enlarges the recall (ferromagnetic) region by mitigating the formation of spurious memories given by overlapping patterns in the mixed phase (in alignment with previous work32 on mitigation of spurious memories in the relativistic model).
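The two effective temperatures can be compared directly with a small numeric check (β = 1 and the grid of energy densities are our choices):

```python
import numpy as np

gamma_p = -0.8
e = np.linspace(-0.4, 0.0, 5)                  # energy density e = E/N
beta_deformed = 1.0 / (1.0 - gamma_p * e)      # beta' = beta (1 - gamma' E/N)^-1
beta_relativistic = 0.5 / np.sqrt(1.0 - gamma_p * e)  # (1/2)(1 - gamma' E/N)^-1/2
for ei, bd, br in zip(e, beta_deformed, beta_relativistic):
    print(f"E/N = {ei:+.2f}   deformed: {bd:.3f}   relativistic: {br:.3f}")
# With gamma' < 0, both grow as the energy decreases: retrieval self-accelerates.
```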
This perspective on accelerated memory retrieval by nonlinearity extends to dense associative memories20,21, which achieve supralinear memory capacities through nonlinear pattern encoding. Specifically, their energy function is given by \({{\mathcal{F}}}=-\!{\sum }_{a}F({\sum }_{i}{\xi }_{i}^{a}{x}_{i})\) with F being, e.g., a thresholded power function20, \(F(z)={\left[z\right]}_{\!+}^{p}\), or an exponential nonlinearity21, F(z) = ez, at zero temperature. These nonlinearities narrow basins of attraction, reducing memory overlap and preventing transitions to the spin-glass phase. The jumps in the Glauber dynamics of such systems are weighted by an accelerating function. Namely, from our perspective, the dynamics of such systems can be described via positive feedback on weights linked to a specific memory, which increase during memory retrieval. This follows from the fact that, relating the linear difference in Mattis terms \(\Delta {\epsilon }_{k}^{a}\equiv 2{\xi }_{k}^{a}{x}_{k}\) with the nonlinear difference \(\Delta {F}_{k}^{a}\equiv F\left({\sum }_{i}{\xi }_{i}^{a}{x}_{i}\right)-F\left({\sum }_{i}{\xi }_{i}^{a}{x}_{i}-\Delta {\epsilon }_{k}^{a}\right)\), the update of the kth neuron is determined by the sign of

\({\sum }_{a}\Delta {F}_{k}^{a}={\sum }_{a}\frac{\Delta {F}_{k}^{a}}{\Delta {\epsilon }_{k}^{a}}\,\Delta {\epsilon }_{k}^{a}=2{x}_{k}{\sum }_{a}{w}_{k}^{a}\,{\xi }_{k}^{a}.\)  (28)
Here, we show that the effective weight \({w}_{k}^{a}\equiv \frac{\Delta {F}_{k}^{a}}{\Delta {\epsilon }_{k}^{a}}\) becomes an increasing function of \({\sum }_{i}{\xi }_{i}^{a}{x}_{i}\) when F is a power, exponential, or, more generally, a convex function (see Supplementary Note 7). Thus, increasing \({\sum }_{i}{\xi }_{i}^{a}{x}_{i}\) as pattern ξ a is retrieved strengthens its basin of attraction and ensures positive feedback. Meanwhile, retrieval of ξ a reduces \({\sum }_{i}{\xi }_{i}^{b}{x}_{i}\) for orthogonal patterns ξ b, lowering their weights and suppressing their recall to minimise interference. This competitive mechanism highlights the higher memory capacity of these models compared to curved neural networks with uniform temperature scaling. Unlike the effective inverse temperature in curved networks, which depends only on the system’s state or energy, the effective weight in updating the k-th neuron additionally depends on the neuron’s state xk, thus no longer representing a global modulation of the energy.
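The monotonicity of the effective weight is straightforward to verify numerically for the two nonlinearities mentioned above (illustrative overlap values; Δε = 2 corresponds to a single spin flip):

```python
import numpy as np

def effective_weight(F, s, d_eps=2.0):
    """w = [F(s) - F(s - d_eps)] / d_eps: the discrete slope of the nonlinearity
    over one neuron flip of the linear overlap s = sum_i xi_i x_i."""
    return (F(s) - F(s - d_eps)) / d_eps

F_power = lambda z: np.maximum(z, 0.0) ** 3     # thresholded power, F(z) = [z]_+^p
F_exp = np.exp                                  # exponential nonlinearity

for s in (2.0, 10.0, 50.0):
    print(f"s = {s:5.1f}   w_power = {effective_weight(F_power, s):10.1f}"
          f"   w_exp = {effective_weight(F_exp, s):.3e}")
# Both weights increase with the overlap s: the pattern being retrieved deepens
# its own basin, while near-orthogonal patterns (s ~ 0) keep negligible weight.
```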
Discussion
HOIs play a critical role in enabling emergent collective phenomena in natural and artificial systems. Modelling HOIs is, however, highly non-trivial, often requiring advanced mathematical representations (such as simplicial complexes or hypergraphs) that entail an exponential increase in parameters for large systems. In this paper, we addressed this issue by leveraging the maximum entropy principle to effectively capture HOIs in models via a deformation parameter γ, which is associated with the Rényi entropy. Given their close connection with statistical physics, this family of models provides a useful setup to investigate the effects of HOIs on spin systems—including explosive ferromagnetic and spin-glass phase transitions, which extend studies on anomalous phase transitions found in other systems2,7,8,9,11—as well as on the capability of networks to store memories.
The observed effects in curved neural networks can be explained via an effective temperature, which induces a positive or negative feedback effect in memory retrieval. As discussed above, this effect is present in different forms across other dense associative memories20,21,34. A similar argument may apply to diffusion models framed as dense associative memories25,26, where the energy follows a log-sum-exp nonlinearity. Thus, the acceleration mechanism found in this study clarifies memory retrieval in advanced associative networks, providing an important step toward designing extended memory capacities and improved noise scheduling.
Curved neural networks also provide insights into biological neural systems, where evidence suggests the presence of alternating positive and negative HOIs for even and odd orders, respectively. This alternation leads to sparse neuronal activity, which has been shown to be instrumental for enabling extended periods of total silence5,13,14,15,35. Interestingly, such sparse activity patterns may coexist with the accelerated memory retrieval dynamics, as both involve positive even-order HOIs. The attainment of enhanced memory, combined with sparse activity, presents a promising direction for understanding energy-efficient biological neuronal networks35,36. Future work may investigate how curved neural networks might support both energy efficiency and high memory capacities, potentially by adopting a thresholded, supralinear neuronal activation function20,35. Additionally, developing statistical methods for fitting these models to experimental data (i.e., theories for learning) represents an important, yet largely unexplored, research avenue. Together, these research directions offer a compelling path to uncover the principles of efficient information coding in biological neural systems.
Overall, our results demonstrate the benefits of considering the maximum entropy principle, emergent HOIs, and nonlinear network dynamics as theoretically intertwined notions. As showcased here, such an integrated framework reveals how information encoding, retrieval dynamics, and memory capacity in neural networks are mediated by HOIs, providing principled, analytically tractable tools and insights from statistical mechanics and nonlinear dynamics. More generally, the framework presented in this work extends beyond neural networks and contributes to a general theory of HOIs, paving the road toward a principled study of higher-order phenomena in complex networks.
Data availability
The CIFAR-100 dataset used in this study is available at https://www.cs.toronto.edu/~kriz/cifar.html.
Code availability
The code generated in this study is available in the GitHub repository, https://github.com/MiguelAguilera/explosive-neural-networks.
References
Lambiotte, R., Rosvall, M. & Scholtes, I. From networks to optimal higher-order models of complex systems. Nat. Phys. 15, 313–320 (2019).
Battiston, F. et al. The physics of higher-order interactions in complex systems. Nat. Phys. 17, 1093–1098 (2021).
Amari, S.-i, Nakahara, H., Wu, S. & Sakai, Y. Synchronous firing and higher-order interactions in neuron pool. Neural Comput. 15, 127–142 (2003).
Kuehn, C. & Bick, C. A universal route to explosive phenomena. Sci. Adv. 7, eabe3824 (2021).
Shomali, S. R., Rasuli, S. N., Ahmadabadi, M. N. & Shimazaki, H. Uncovering hidden network architecture from spiking activities using an exact statistical input-output relation of neurons. Commun. Biol. 6, 169 (2023).
Thibeault, V., Allard, A. & Desrosiers, P. The low-rank hypothesis of complex systems. Nat. Phys. 20, 294–302 (2024).
Angst, S., Dahmen, S. R., Hinrichsen, H., Hucht, A. & Magiera, M. P. Explosive Ising. J. Stat. Mech.: Theory Exp. 2012, L06002 (2012).
D’Souza, R. M., Gómez-Gardenes, J., Nagler, J. & Arenas, A. Explosive phenomena in complex networks. Adv. Phys. 68, 123–223 (2019).
Iacopini, I., Petri, G., Barrat, A. & Latora, V. Simplicial models of social contagion. Nat. Commun. 10, 2485 (2019).
Millán, A. P., Torres, J. J. & Bianconi, G. Explosive higher-order Kuramoto dynamics on simplicial complexes. Phys. Rev. Lett. 124, 218301 (2020).
Landry, N. W. & Restrepo, J. G. The effect of heterogeneity on hypergraph contagion models. Chaos 30 (2020).
Montani, F. et al. The impact of high-order interactions on the rate of synchronous discharge and information transmission in somatosensory cortex. Philos. Trans. R. Soc. A: Math. Phys. Eng. Sci. 367, 3297–3310 (2009).
Tkačik, G. et al. Searching for collective behavior in a large network of sensory neurons. PLoS Comput. Biol. 10, e1003408 (2014).
Ohiorhenuan, I. E. et al. Sparse coding and high-order correlations in fine-scale cortical networks. Nature 466, 617–621 (2010).
Shimazaki, H., Sadeghi, K., Ishikawa, T., Ikegaya, Y. & Toyoizumi, T. Simultaneous silence organizes structured higher-order interactions in neural populations. Sci. Rep. 5, 9821 (2015).
Tkačik, G. et al. The simplest maximum entropy model for collective behavior in a neural network. J. Stat. Mech.: Theory Exp. 2013, P03011 (2013).
Tkačik, G. et al. Thermodynamics and signatures of criticality in a network of neurons. Proc. Natl Acad. Sci. 112, 11508–11513 (2015).
Burns, T. F. & Fukai, T. Simplicial Hopfield networks. In: The Eleventh International Conference on Learning Representations (2022).
Bybee, C. et al. Efficient optimization with higher-order Ising machines. Nat. Commun. 14, 6033 (2023).
Krotov, D. & Hopfield, J. J. Dense associative memory for pattern recognition. Adv. Neural Inform. Process. Syst. 29 (2016).
Demircigil, M., Heusel, J., Löwe, M., Upgang, S. & Vermet, F. On a model of associative memory with huge storage capacity. J. Stat. Phys. 168, 288–299 (2017).
Agliari, E. et al. Dense Hebbian neural networks: a replica symmetric picture of unsupervised learning. Phys. A: Stat. Mech. Appl. 627, 129143 (2023).
Lucibello, C. & Mézard, M. Exponential capacity of dense associative memories. Phys. Rev. Lett. 132, 077301 (2024).
Krotov, D. A new frontier for Hopfield networks. Nat. Rev. Phys. 5, 366–367 (2023).
Ambrogioni, L. In search of dispersed memories: Generative diffusion models are associative memory networks. Entropy 26, 381 (2024).
Ambrogioni, L. The statistical thermodynamics of generative diffusion models: Phase transitions, symmetry breaking, and critical instability. Entropy 27, 291 (2025).
Bovier, A. & Niederhauser, B. The spin-glass phase-transition in the Hopfield model with p-spin interactions. Adv. Theor. Math. Phys. 5, 1001–1046 (2001).
Agliari, E., Fachechi, A. & Marullo, C. Nonlinear PDEs approach to statistical mechanics of dense associative memories. J. Math. Phys. 63 (2022).
Amari, S.-i. Information geometry on hierarchy of probability distributions. IEEE Trans. Inf. theory 47, 1701–1711 (2001).
Skardal, P. S. & Arenas, A. Higher order interactions in complex networks of phase oscillators promote abrupt synchronization switching. Commun. Phys. 3, 218 (2020).
Ganmor, E., Segev, R. & Schneidman, E. Sparse low-order interaction network underlies a highly correlated and learnable neural population code. Proc. Natl Acad. Sci. 108, 9679–9684 (2011).
Barra, A., Beccaria, M. & Fachechi, A. A new mechanical approach to handle generalized Hopfield neural networks. Neural Netw. 106, 205–222 (2018).
Agliari, E., Barra, A. & Notarnicola, M. The relativistic Hopfield network: rigorous results. J. Math. Phys. 60 (2019).
Agliari, E., Alemanno, F., Barra, A. & Fachechi, A. Generalized Guerra’s interpolation schemes for dense associative neural networks. Neural Netw. 128, 254–267 (2020).
Rodríguez-Domínguez, U. & Shimazaki, H. Alternating shrinking higher-order interactions for sparse neural population activity. Preprint at https://arxiv.org/abs/2308.13257 (2023).
Santos, S., Niculae, V., McNamee, D. & Martins, A. F. Hopfield-fenchel-young networks: a unified framework for associative memory retrieval. Preprint at https://arxiv.org/abs/2411.08590 (2024).
Hoover, B., Chau, D. H., Strobelt, H., Ram, P. & Krotov, D. Dense associative memory through the lens of random features. Adv. Neural Inform. Process. Syst. 38 (2024).
Jaynes, E. T. Probability Theory: The Logic of Science (Cambridge University Press, 2003).
Cofré, R., Herzog, R., Corcoran, D. & Rosas, F. E. A comparison of the maximum entropy principle across biological spatial scales. Entropy 21, 1009 (2019).
Jaynes, E. T. Information theory and statistical mechanics. Phys. Rev. 106, 620 (1957).
Tsallis, C., Mendes, R. & Plastino, A. R. The role of constraints within generalized nonextensive statistics. Phys. A: Stat. Mech. Appl. 261, 534–554 (1998).
Morales, P. A. & Rosas, F. E. Generalization of the maximum entropy principle for curved statistical manifolds. Phys. Rev. Res. 3, 033216 (2021).
Valverde-Albacete, F. & Peláez-Moreno, C. The case for shifting the Rényi entropy. Entropy 21, 46 (2019).
Umarov, S., Tsallis, C. & Steinberg, S. On a q-central limit theorem consistent with nonextensive statistical mechanics. Milan. J. Math. 76, 307–328 (2008).
Wong, T.-K. L. & Zhang, J. Tsallis and Rényi deformations linked via a new λ-duality. IEEE Trans. Inf. Theory 68, 5353–5373 (2022).
Guisande, N. & Montani, F. Rényi entropy-complexity causality space: a novel neurocomputational tool for detecting scale-free features in EEG/iEEG data. Front. Comput. Neurosci. 18, 1342985 (2024).
Jauregui, M., Zunino, L., Lenzi, E. K., Mendes, R. S. & Ribeiro, H. V. Characterization of time series via Rényi complexity–entropy curves. Phys. A: Stat. Mech. Appl. 498, 74–85 (2018).
Wong, T.-K. L. Logarithmic divergences from optimal transport and Rényi geometry. Inf. Geom. 1, 39–78 (2018).
Vigelis, R. F., De Andrade, L. H. & Cavalcante, C. C. Properties of a generalized divergence related to Tsallis generalized divergence. IEEE Trans. Inf. Theory 66, 2891–2897 (2019).
Amari, S.-I. Information Geometry and its Applications Vol. 194 (Springer, 2016).
Roudi, Y., Dunn, B. & Hertz, J. Multi-neuronal activity and functional connectivity in cell assemblies. Curr. Opin. Neurobiol. 32, 38–44 (2015).
Montúfar, G. in Information Geometry and Its Applications: On the Occasion of Shun-ichi Amari’s 80th Birthday, IGAIA IV Liblice, Czech Republic, June 2016, (eds Ay, N., Gibilisco, P. & Matúš, F.) 75–115 (Springer, 2018).
Nakano, K. Associatron—a model of associative memory. IEEE Trans. Syst. Man Cybern. 3, 380–388 (1972).
Amari, S.-I. Learning patterns and pattern sequences by self-organizing nets of threshold elements. IEEE Trans. Comput. 100, 1197–1206 (1972).
Hopfield, J. J. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl Acad. Sci. 79, 2554–2558 (1982).
Amit, D. J. Modeling Brain Function: the World of Attractor Neural Networks (Cambridge University Press, 1989).
Coolen, A. C., Kühn, R. & Sollich, P. Theory of Neural Information Processing Systems (OUP Oxford, 2005).
Coolen, A. In Handbook of Biological Physics (eds Moss, F. & Gielen, S.) Vol. 4, 553–618 (Elsevier, 2001).
Coolen, A. In Handbook of Biological Physics (eds Moss, F. & Gielen, S.) Vol. 4, 619–684 (Elsevier, 2001).
Mattis, D. Solvable spin systems with random interactions. Phys. Lett. A 56, 421–422 (1976).
Kochmański, M., Paszkiewicz, T. & Wolski, S. Curie–Weiss magnet—a simple model of phase transition. Eur. J. Phys. 34, 1555 (2013).
Amit, D. J., Gutfreund, H. & Sompolinsky, H. Storing infinite numbers of patterns in a spin-glass model of neural networks. Phys. Rev. Lett. 55, 1530 (1985).
Bovier, A., Gayrard, V. & Picco, P. Gibbs states of the Hopfield model with extensively many patterns. J. Stat. Phys. 79, 395–414 (1995).
Talagrand, M. Rigorous results for the Hopfield model with many patterns. Probab. theory Relat. fields 110, 177–275 (1998).
Shcherbina, M. & Tirozzi, B. The free energy of a class of Hopfield models. J. Stat. Phys. 72, 113–125 (1993).
Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images (University of Toronto, 2009).
Fontanari, J. F. & Theumann, W. On the storage of correlated patterns in Hopfield’s model. J. Phys. 51, 375–386 (1990).
Agliari, E., Barra, A., De Antoni, A. & Galluzzi, A. Parallel retrieval of correlated patterns: from Hopfield networks to Boltzmann machines. Neural Netw. 38, 52–63 (2013).
Sherrington, D. & Kirkpatrick, S. Solvable model of a spin-glass. Phys. Rev. Lett. 35, 1792 (1975).
Acknowledgements
The authors thank Ulises Rodriguez Dominguez for valuable discussions on this manuscript. M.A. is funded by a Junior Leader fellowship from ‘la Caixa’ Foundation (ID 100010434, code LCF/BQ/PI23/11970024), John Templeton Foundation (grant 62828), Basque Government ELKARTEK funding (code KK-2023/00085) and Grant PID2023-146869NA-I00 funded by MICIU/AEI/10.13039/501100011033 and cofunded by the European Union, and supported by the Basque Government through the BERC 2022-2025 program and by the Spanish State Research Agency through BCAM Severo Ochoa excellence accreditation CEX2021-01142-S funded by MICIU/AEI/10.13039/501100011033. P.A.M. acknowledges support by JSPS KAKENHI Grant Number 23K16855, 24K21518. F.R. is supported by the UK ARIA Safeguarded AI programme and the PIBBSS Affiliatership programme. H.S. is supported by JSPS KAKENHI Grant Number JP 20K11709, 21H05246, 24K21518, 25K03085.
Author information
Authors and Affiliations
Contributions
M.A., P.A.M., F.E.R., and H.S. designed and reviewed the research and wrote the paper. M.A. contributed the analytical and numerical results. P.A.M. contributed part of the analytical results of the replica analysis.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Luca Ambrogioni, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Aguilera, M., Morales, P.A., Rosas, F.E. et al. Explosive neural networks via higher-order interactions in curved statistical manifolds. Nat Commun 16, 6511 (2025). https://doi.org/10.1038/s41467-025-61475-w