Introduction

Equilibrium statistical physics is a well-understood field with powerful tools available to thoroughly describe any system in equilibrium1. Fundamentally, equilibrium is associated with the time reversal symmetry of the dynamics or in the language of Markov chains, with detailed balance (or micro-reversibility)2. One of the most important consequences of this property is the strong link between dynamics and statics. In fact, the forces driving the dynamical behavior derive from an energy function E. The steady state reached at long times is characterized by the Boltzmann-Gibbs distribution with the same energy function E.

However, many, if not most, real-world systems operate away from equilibrium: detailed balance is violated, and the link between dynamics and statics does not hold anymore. While small deviations from equilibrium can be effectively described using, for instance, linear response theory and perturbation theory3, systems far from equilibrium pose significant challenges because traditional tools are not easily applicable. The main difficulty is that even though forces and dynamical equations are known, there are no general principles allowing for the determination of the steady state probability distribution. One can define an effective energy function, Eeff, as minus the logarithm of the steady state distribution, but except for very few cases, e.g., ref. 4, this effective energy cannot be obtained analytically, and its general properties are unknown. Contrary to equilibrium, there is no longer a simple relationship between the forces driving the dynamics and Eeff. For instance, even if the forces are short-ranged, there is no guarantee for Eeff to be short-ranged. Indeed, Eeff has been shown to contain medium and long-range interactions in the few cases in which an exact analysis was achieved4. These interactions could play a major role in giving rise to the new and striking phenomena observed in out-of-equilibrium systems, such as, e.g., flocking5, motility-induced macrophase6 and microphase separation7, and turbulence8. In consequence, characterizing the energy function and its interactions across scales is a major theoretical challenge.

This paper aims to address this challenge by employing the Wavelet Conditional Renormalization Group (WCRG) method9,10,11 to construct Eeff from realizations of the system. WCRG uses wavelet theory to decompose samples at different scales, allowing for a well-conditioned inference of the parameters of a parametric energy model. We apply this method to obtain the effective energy description of Active Model B+ (AMB+), which is a general continuum field theory describing isotropic bulk phases of self-propelled particles12, such as microorganisms or synthetic micro-swimmers. This model is a generalization of model B, which was introduced to analyze the dynamics of equilibrium critical phenomena13. It offers a suitable framework for our work, as its study is interesting on its own, and at the same time, it offers a challenging but simple and general enough setting for the application of WCRG to non-equilibrium systems. The outcome of the WCRG analysis is a quantitatively accurate model of the effective energy, which identifies interactions across scales in terms of multiscale long-range scalar potentials. Our results establish a link between emergent medium-long-range interactions in Eeff and the non-equilibrium nature of the steady states. In fact, we show that these interactions are responsible for the micro-phase separation induced by activity, and that their range is directly connected to the scale associated with entropy production patterns and Fluctuation-Dissipation theorem (FDT) violations. The emergence of long-range correlations, and possibly long-range interactions, is a central feature of non-equilibrium systems, studied in refs. 6,14,15,16,17,18,19,20,21,22,23 for active systems. It is also a major difficulty for inferring effective energies, as it leads to singular estimation problems9.
Our approach leverages wavelet decomposition and multiscale decomposition of the probability law to stabilize the estimation problem, opening the way to a characterization of the long-range interactions leading to large-scale phenomena associated with non-equilibrium systems.

Results

Wavelet-conditional RG and effective energy models

As discussed in the introduction, here we are interested in using data to estimate the steady state distribution and the effective energy of many-body out-of-equilibrium systems. One of our aims is to show how this task can be achieved by the Wavelet Conditional Renormalization Group (WCRG)9,10,11. WCRG is a data-driven method, which proceeds as an inverse renormalization group procedure: it goes from large scales down to microscopic scales, and it gains information from data at each scale to eventually reconstruct the microscopic effective energy.

The procedure is illustrated in Fig. 1, which shows the iterative wavelet decomposition of an initial physical field φ0 into a coarse-grained field φJ and wavelet fields \({\overline{\varphi }}_{j}\). This decomposition allows us to rewrite the full probability law, p0(φ0), in a multiscale way:

$${p}_{0}({\varphi }_{0})={p}_{J}({\varphi }_{J}){\prod }_{j=1}^{J}{\overline{p}}_{j}({\overline{\varphi }}_{j}| {\varphi }_{j})\,.$$
(1)

WCRG focuses directly on the conditional probability distribution for the wavelet fields \({\overline{p}}_{j}({\overline{\varphi }}_{j}| {\varphi }_{j})\) at each scale. Their estimation is numerically stable, and their sampling is fast because at each scale they only contain high frequencies9. WCRG can then be used for systems exhibiting long-range correlations, for which a direct estimation of the effective microscopic energy would fail. The model for the conditional probabilities is low-dimensional, in some cases nearly Gaussian10,11, and is physics-informed.

Fig. 1: Wavelet decomposition of the AMB+.

The iterative wavelet decomposition, or fast wavelet transform (in green), iteratively decomposes the field φj−1, with length scale \({2}^{j-1}\), into a coarser approximation φj and 3 wavelet coefficient images \({\overline{\varphi }}_{j}\), with sub-sampled low-pass filtering G and high-pass filtering, along different directions, \(\overline{G}\). It can be inverted (in blue), and φj−1 can be recovered from \(({\varphi }_{j},{\overline{\varphi }}_{j})\), with the transpose operators \(({G}^{{{{\rm{T}}}}},{\overline{G}}^{{{{\rm{T}}}}})\). The illustration uses a realization of AMB+ at parameters (λ, ξ) = (1, 4), which set the amount of activity of the model.
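The decomposition and its inversion can be illustrated with a minimal sketch. It assumes orthonormal Haar filters, chosen here only for simplicity (the wavelet family actually used is discussed in "Methods"), and the function names `haar_step`, `haar_inverse`, `wavelet_decompose` and `wavelet_reconstruct` are hypothetical.

```python
import numpy as np

def haar_step(phi):
    """One level of an orthonormal 2D Haar transform (array sides must be even).

    Returns the coarse field (half the size) and the 3 wavelet-coefficient images.
    """
    a = phi[0::2, 0::2]
    b = phi[0::2, 1::2]
    c = phi[1::2, 0::2]
    d = phi[1::2, 1::2]
    coarse = (a + b + c + d) / 2.0          # sub-sampled low-pass output
    details = np.stack([
        (a - b + c - d) / 2.0,              # differences along columns
        (a + b - c - d) / 2.0,              # differences along rows
        (a - b - c + d) / 2.0,              # diagonal differences
    ])
    return coarse, details

def haar_inverse(coarse, details):
    """Recover the finer field from the coarse field and its 3 detail images."""
    h, v, dg = details
    phi = np.empty((2 * coarse.shape[0], 2 * coarse.shape[1]))
    phi[0::2, 0::2] = (coarse + h + v + dg) / 2.0
    phi[0::2, 1::2] = (coarse - h + v - dg) / 2.0
    phi[1::2, 0::2] = (coarse + h - v - dg) / 2.0
    phi[1::2, 1::2] = (coarse - h - v + dg) / 2.0
    return phi

def wavelet_decompose(phi0, n_scales):
    """Iterate haar_step: returns the coarsest field and the details, finest first."""
    details = []
    phi = phi0
    for _ in range(n_scales):
        phi, d = haar_step(phi)
        details.append(d)
    return phi, details

def wavelet_reconstruct(coarse, details):
    """Invert wavelet_decompose, going from the coarsest scale back to the finest."""
    phi = coarse
    for d in reversed(details):
        phi = haar_inverse(phi, d)
    return phi
```

With this orthonormal normalization the coarse field is twice the 2×2 block average, and the sum of squares of all coefficients equals that of the original field.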

Once all the conditional probabilities are estimated, one has full knowledge of the probability distribution p0(φ0). One can then sample new data efficiently, as shown in ref. 9. Even more importantly, one has access to the interactions between degrees of freedom across scales. In particular, using the model for \({\overline{p}}_{j}({\overline{\varphi }}_{j}| {\varphi }_{j})\) introduced in ref. 9, one can obtain the effective energy for the microscopic field φ0 as

$${E}_{{{{\rm{eff}}}}}({\varphi }_{0})={\varphi }_{0}^{{{{\rm{T}}}}}{K}_{0}{\varphi }_{0}+{\sum }_{j=0}^{J}{V}_{j}({\varphi }_{j})\,.$$
(2)

The first term is a quadratic non-local interaction. In simple cases, it reduces to the discrete version of a short-range differential operator, e.g., a Laplacian. The V0 term is a local potential contribution V0(φ0) = ∑xV0(φ0(x)). The other terms are analogous to V0 but act on the coarse-grained versions of the field (and the sum over x runs on the coarse-grained lattice). They represent progressively longer-range interactions, which, as we shall show, play an important role out of equilibrium. By construction, they do not contain any linear and quadratic terms to avoid redundancy with the two previous terms.
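To make the structure of Eq. (2) concrete, the following sketch evaluates an energy of this form on a periodic lattice. Several ingredients are assumptions made for illustration: the kernel K0 is taken translation-invariant and given by a 3×3 stencil, the coarse-grained fields are obtained by 2×2 block averaging as a stand-in for the wavelet low-pass filter, and the potentials are arbitrary callables. The names `coarse_grain`, `quadratic_term` and `effective_energy` are hypothetical.

```python
import numpy as np

def coarse_grain(phi):
    """Average over 2x2 blocks (assumed stand-in for the low-pass filter)."""
    return 0.25 * (phi[0::2, 0::2] + phi[0::2, 1::2]
                   + phi[1::2, 0::2] + phi[1::2, 1::2])

def quadratic_term(phi, stencil):
    """phi^T K phi for a translation-invariant K given by a 3x3 stencil,
    with periodic boundary conditions."""
    k_phi = np.zeros_like(phi)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            k_phi += stencil[di + 1, dj + 1] * np.roll(phi, (di, dj), axis=(0, 1))
    return float(np.sum(phi * k_phi))

def effective_energy(phi0, stencil, potentials):
    """Quadratic term plus local potentials V_j summed over the sites of phi_j.

    potentials[j] is a callable applied pointwise to the field at scale j.
    """
    energy = quadratic_term(phi0, stencil)
    phi_j = phi0
    for j, v_j in enumerate(potentials):
        if j > 0:
            phi_j = coarse_grain(phi_j)
        energy += float(np.sum(v_j(phi_j)))
    return energy
```

For a uniform field and a stencil whose entries sum to zero, the quadratic term vanishes and only the potentials contribute, each weighted by the number of sites at its scale.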

The estimation of the conditional probabilities on each scale can be performed in parallel. Rather than minimizing the Kullback-Leibler (KL) divergence, which is computationally expensive, we minimize a loss based on the score (the gradient of the logarithm of the probability distribution), which is much easier to do. Technical details are laid out in the “Methods” section and in depth in Appendix C of ref. 11. A noteworthy point is that WCRG can be adapted to the regularity of the field by choosing a suitable type of wavelet, as discussed in more detail in the “Methods” section.
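The reason a score-based loss is cheap can be seen on a toy problem. The sketch below is not the WCRG estimator itself: it assumes a one-dimensional energy that is linear in its parameters, E_θ(x) = θ1 x²/2 + θ2 x⁴/4, and uses the standard (Hyvärinen) score-matching loss, which is then quadratic in θ and is minimized by solving a linear system without ever computing a normalization constant. The function name `score_matching_fit` is hypothetical.

```python
import numpy as np

def score_matching_fit(x):
    """Fit E_theta(x) = theta_1 x^2/2 + theta_2 x^4/4 to 1D samples.

    The score-matching loss E[0.5 (dE/dx)^2 - d2E/dx2] equals
    0.5 theta^T H theta - theta^T g, so its minimizer solves H theta = g.
    """
    grad_u = np.stack([x, x**3])                     # first derivatives of the features
    lap_u = np.stack([np.ones_like(x), 3.0 * x**2])  # second derivatives of the features
    H = grad_u @ grad_u.T / x.size
    g = lap_u.mean(axis=1)
    return np.linalg.solve(H, g)

# Usage: samples from a unit Gaussian correspond to theta close to (1, 0).
rng = np.random.default_rng(0)
theta = score_matching_fit(rng.standard_normal(200_000))
```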

Active Model B+

Active matter has been one of the most studied research topics in non-equilibrium physics in recent years24,25,26. Active matter systems are studied in biology, soft-matter physics, and animal behavior. They are formed by several interacting microscopic degrees of freedom (bacteria, cells, vibrated particles, insects, birds, …) which use and dissipate energy to move or exert mechanical forces. In consequence, active systems violate microscopic reversibility (local detailed balance) and are intrinsically out of thermal equilibrium. They display a wide diversity of new behaviors compared to their equilibrium counterparts. In order to study the change due to activity on critical and collective behaviors, several works have studied active versions of equilibrium models introduced for critical dynamics2,13. In this work, we focus on one of the most studied systems: Active Model B+ (AMB+)27. As the name suggests, it is a generalization of the well-known Model B2,13, describing the stochastic equilibrium dynamics of the conserved scalar \({\varphi }^{4}\) field theory. AMB+ is obtained by adding to it all terms allowed by symmetries up to order \({{{\mathcal{O}}}}({\nabla }^{4}{\varphi }^{2})\):

$$\begin{aligned}{\partial }_{t}\varphi &=-\nabla \cdot ({{{\boldsymbol{J}}}}+\sqrt{2DM}{{{\mathbf{\Lambda }}}})\\ {{{\boldsymbol{J}}}}/M&=-\nabla \left[\frac{\delta {{{\mathcal{F}}}}}{\delta \varphi }+\lambda | \nabla \varphi {| }^{2}\right]+\xi ({\nabla }^{2}\varphi )\nabla \varphi \\ {{{\mathcal{F}}}}[\varphi ]&=\int\left\{-\frac{{\varphi }^{2}}{2}+\frac{{\varphi }^{4}}{4}+\frac{K}{2}| \nabla \varphi {| }^{2}\right\}{{{\rm{d}}}}{{{\boldsymbol{x}}}}\\ \langle {\Lambda }^{m}({{{\boldsymbol{x}}}},t){\Lambda }^{n}({{{{\boldsymbol{x}}}}}^{{\prime} },{t}^{{\prime} })\rangle &={\delta }^{mn}\delta ({{{\boldsymbol{x}}}}-{{{{\boldsymbol{x}}}}}^{{\prime} })\delta (t-{t}^{{\prime} })\,,\end{aligned}$$
(3)

where Λ(x, t) is Gaussian white noise with zero mean and unit variance, and D is the strength of thermal fluctuations (proportional to the temperature). This is the most general model for a conserved field at this order. In the following, for simplicity, we set the coefficients K and M to constants equal to one.

It has been shown that AMB+ can be obtained by coarse-graining the microscopic dynamics of Active Ornstein-Uhlenbeck particles27. Note that Eq. (3) is a multidimensional Langevin equation in which the force, − J, is not conservative. This is due to the parameters λ and ξ, which control the activity of the model and break local detailed balance. The system does reach a steady state at long times, but it is a non-equilibrium one that does not verify the time-reversal symmetry (TRS). In this work, all numerical results are obtained from a discretized square lattice version of Eq. (3). The mean value of the field is conserved by the dynamics; we chose it equal to 0.6 in order to focus on phase separation regimes. Additional details on the simulations are discussed in Methods.
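As an illustration of what a lattice version of Eq. (3) can look like, the sketch below performs one explicit Euler-Maruyama step on a periodic square lattice with unit spacing. The discretization is an assumption made for this example (central differences for gradients and divergences, a five-point Laplacian) and is not necessarily the scheme used for the results of this paper, which is described in "Methods". The names `grad`, `div`, `lap` and `amb_plus_step` are hypothetical.

```python
import numpy as np

def grad(f):
    """Central-difference gradient on a periodic lattice with unit spacing."""
    gx = 0.5 * (np.roll(f, -1, axis=0) - np.roll(f, 1, axis=0))
    gy = 0.5 * (np.roll(f, -1, axis=1) - np.roll(f, 1, axis=1))
    return gx, gy

def div(vx, vy):
    """Central-difference divergence of a vector field (periodic)."""
    return (0.5 * (np.roll(vx, -1, axis=0) - np.roll(vx, 1, axis=0))
            + 0.5 * (np.roll(vy, -1, axis=1) - np.roll(vy, 1, axis=1)))

def lap(f):
    """Five-point Laplacian (periodic)."""
    return (np.roll(f, 1, axis=0) + np.roll(f, -1, axis=0)
            + np.roll(f, 1, axis=1) + np.roll(f, -1, axis=1) - 4.0 * f)

def amb_plus_step(phi, lam, xi, D, dt, rng, K=1.0, M=1.0):
    """One Euler-Maruyama step of Eq. (3)."""
    gx, gy = grad(phi)
    lap_phi = lap(phi)
    # dF/dphi + lambda |grad phi|^2, with F the phi^4 free energy of Eq. (3)
    mu = -phi + phi**3 - K * lap_phi + lam * (gx**2 + gy**2)
    mx, my = grad(mu)
    jx = M * (-mx + xi * lap_phi * gx)
    jy = M * (-my + xi * lap_phi * gy)
    # Discretized white noise: variance 1/dt per site and per component
    amp = np.sqrt(2.0 * D * M / dt)
    nx = amp * rng.standard_normal(phi.shape)
    ny = amp * rng.standard_normal(phi.shape)
    # The update is a discrete divergence, so the spatial mean is conserved
    return phi - dt * div(jx + nx, jy + ny)
```

Because both the deterministic current and the noise enter through a divergence, the mean value of the field is unchanged by each step, as required for a conserved field.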

The main advantage of AMB+ in the context of this paper is that the amount of activity is tunable, via the two parameters λ and ξ. Moreover, the model displays a phase transition which, depending on the values of λ and ξ, can be equilibrium-like or show new features related to the violation of time-reversal symmetry28,29,30. Figure 2 displays the state points on which we have focused in the (λ, ξ) plane. In our study, we have fixed D = 0.45, which is below the transition temperature Dc = 0.54 of the passive model, but still retains a substantial amount of thermal fluctuations. We show in Fig. 2 the zero-noise D = 0 separation line between the “effectively passive” and the “effectively active” phases (P and A phase, respectively, on the figure), and snapshots of the corresponding phases (adapted from ref. 27). For a low level of activity, the large-scale phase separation resembles what happens in equilibrium (top right panels). When the amount of activity increases, the system displays reversed Ostwald ripening and micro-phase separation (bottom right panels). The transition between the two regimes can be traced to the change of the sign of the pseudotension, which is a generalization of the equilibrium surface tension. It determines the Laplace pressure jump at interfaces. In the A phase, this pseudotension can be negative, leading to reversed Ostwald ripening and, hence, micro-phase separation. For an in-depth discussion, we refer the reader to ref. 27.

Fig. 2: Macroscopic phase separation behavior of the Active Model B+.

a Phase diagram of AMB+ in the mean-field regime D = 0 (where D is the noise amplitude), adapted from ref. 27. The P, for passive, phase is where the activity has no significant effect on the large-scale phase separation; standard Ostwald ripening is observed. The A, for active, phase is where activity affects the large-scale steady state phase separation: reversed Ostwald ripening leads to the stabilization of smaller structures and drives the system to microphase separation. Squares, triangles, and disks correspond to the set of parameters we used to generate the training dataset for the model. The equilibrium case is indicated in black. Going from greener to bluer hues corresponds to an increase in activity. b For the three specific sets of parameters (λ = 0, ξ = 0), (λ = −0.5, ξ = 1), (λ = 1, ξ = 4), where (λ, ξ) tune the amount of activity in the model, the field on the left is a representative example from the dataset, and the one on the right is a sample generated from the learned WCRG model. Blue indicates negative values of the field, while green indicates positive values.

In the following, we will apply the WCRG to the AMB+ in the different regions of the phase diagram. Our aim will be to identify the effective energy for each state point and relate its main features to the non-equilibrium behavior of the system.

Emergence of long-range interactions

We now present our first main result: the effective energy inferred from realizations of the system exhibits, when activity is strong, long-range interactions that lead to micro-phase separation.

For each state point of Fig. 2, we have produced 5000 independent steady state configurations for a lattice of linear size L = 128. Using these configurations as input, we have estimated the conditional probabilities, the quadratic kernels, and the multiscale potentials introduced in Eq. (2). For details on simulations and WCRG procedures, see “Methods”. Figure 3 presents results for the quadratic kernel K0. On the left, we show K0 in real space for the passive case, which is at thermal equilibrium; it agrees with the discretized Laplacian Δij we have used in the numerical simulations. We use a nine-point stencil for the Laplacian: Δij = 0.5(δi−1,j−1 + δi−1,j+1 + δi+1,j−1 + δi+1,j+1) − 2(δi,j−1 + δi,j+1 + δi−1,j + δi+1,j) + 6δi,j. Kernels for the other studied values of the activity parameters (λ, ξ) are available in the Supplementary Information Fig. S6. On the right, we compare for all (λ, ξ) of Fig. 2 the corresponding Fourier power spectra. They are numerically very close to the one corresponding to the passive case, thus showing that K0 is a discretized Laplacian, or very close to it, for every state point. The increase in activity, therefore, has no effect on this part of the effective energy.
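The Fourier-space comparison can be illustrated on the stencil itself. The sketch below builds the nine-point stencil quoted above on a periodic lattice and computes its Fourier transform, which can be compared with the closed form 6 − 4(cos qx + cos qy) + 2 cos qx cos qy obtained by summing the stencil entries against plane waves; this expression vanishes at q = 0 and behaves as |q|² at small q. The function names and the lattice size are hypothetical choices made for the example.

```python
import numpy as np

# Nine-point stencil quoted in the text: 0.5 on diagonals, -2 on nearest
# neighbours, 6 at the centre.
STENCIL = np.array([[0.5, -2.0, 0.5],
                    [-2.0, 6.0, -2.0],
                    [0.5, -2.0, 0.5]])

def stencil_spectrum(stencil, n):
    """Fourier transform of a 3x3 stencil embedded in an n x n periodic lattice."""
    kernel = np.zeros((n, n))
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            kernel[di % n, dj % n] = stencil[di + 1, dj + 1]
    # The stencil is symmetric under (di, dj) -> (-di, -dj), so the transform is real.
    return np.fft.fft2(kernel).real

def analytic_spectrum(n):
    """Closed form of the same spectrum on the discrete wave vectors."""
    q = 2.0 * np.pi * np.fft.fftfreq(n)
    qx, qy = np.meshgrid(q, q, indexing="ij")
    return 6.0 - 4.0 * (np.cos(qx) + np.cos(qy)) + 2.0 * np.cos(qx) * np.cos(qy)
```

An inferred kernel could be compared with this reference in the same way, by replacing the stencil with the estimated real-space kernel.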

Fig. 3: Inferred quadratic part.

a Estimated Gaussian kernel K0 for the passive setting, obtained when the two activity parameters λ and ξ are set to 0. It closely matches a discrete nine-point stencil Laplacian. Positive values are in yellow, negative ones in dark blue. b Power spectrum of the kernel K0, for different values of the activity parameters (λ, ξ), as a function of the wave number amplitude q. All kernels have a very similar spectrum.

The situation is very different for the local potentials Vj. Each Vj corresponds to a coarse-grained scale \({\ell }_{j}={2}^{j}a\), where a is the lattice spacing, which henceforth we fix to one by adjusting the unit of length. As previously discussed, each multiscale potential encodes information about interactions on the scale \({\ell }_{j}\). For instance, V3 corresponds to interactions on a range of \({\ell }_{3}=8\) sites. Figure 4 shows Vj for all different state points.

Fig. 4: Multiscale potentials.

V0: Finest-scale scalar potential for different sets of parameters. V1, V2, V3 and V4: Multi-scale long-range scalar potentials for different sets of activity parameters (λ, ξ), at increasing j. These potentials correspond to interactions on a length scale of \({\ell }_{j}={2}^{j}\), so from left to right \(\ell =1\), \(\ell =2\), \(\ell =4\), \(\ell =8\), \(\ell =16\). For the equilibrium system (dashed black line), only the microscopic V0 potential is non-zero, as expected. For each j, the x-axis is squeezed into [0, 1], with the transformation \({\tilde{\varphi }}_{j}=\frac{{\varphi }_{j}-\min ({\varphi }_{j})}{\max ({\varphi }_{j})-\min ({\varphi }_{j})}\), where φj is the coarse-grained field at scale \({2}^{j}\). The max and min are estimated over the dataset. We do not rescale the y-axis, and define \({V}_{j}({\tilde{\varphi }}_{j})={V}_{j}({\varphi }_{j})\). We plot \(\left({\tilde{\varphi }}_{j},{V}_{j}({\tilde{\varphi }}_{j})\right)\) over the window \({\tilde{\varphi }}_{j}\in [0,1]\).

In the equilibrium case (black dashed line), we clearly see that the only significantly non-zero term is V0 (the small non-zero values of \({V}_{j > 0}\) are due to estimation errors). This is exactly what is expected: the invariant probability distribution in equilibrium is \(\exp (-{{{\mathcal{F}}}}/D)/{Z}_{D}\), and hence the effective potential V0 coincides with the one present in the dynamical equations through \({{{\mathcal{F}}}}\), as reported in the Supplementary Information Fig. S4, whereas all other potentials are zero. In equilibrium, short-range interactions in the dynamical equations imply short-range interactions in the energy function. For the cases with non-zero activity, the situation is different. Two main behaviors arise. For a relatively low level of activity (curves in greener hues), which corresponds to systems behaving macroscopically similarly to the equilibrium case, as shown in Fig. 2, all the potentials, from V1 and up, stay small, close to the zero baseline of the equilibrium case. Instead, for high activity (curves in bluer hues), the potentials develop significant contributions up to V3. V4 and higher potentials are essentially zero except for atypical values of the field (for which the estimation is poor and the error large). Note that finite-size effects are also likely to play a role and can explain why V4 does not vanish more clearly. Quantifying such effects would require studying and comparing larger and smaller system sizes, a problem which we leave for future work.

These potentials are crucial to stabilize micro-phase separation. In fact, an energy function with a Laplacian K0 and potentials as the V0 and V1 in Fig. 4, which locally favor one value of the field (the positive one), would lead to macroscopic phase separation. The potentials V2 and V3 counteract the effect of V0 and V1 by favoring the other (negative) value of the field at a coarser level. The net result is a blocking of macroscopic phase separation, which is instead replaced by the formation of bubbles with length-scales up to \({\ell }_{3}\), as shown in Fig. 2. The effective potentials are thus asymmetric, and this asymmetry gets more pronounced when the activity level increases. Notice that one does not expect the effective potentials or energy function to be symmetric, because the dynamical equation defining AMB+ is not symmetric with respect to a change of sign of φ. The proper symmetry is instead (φ, λ, ξ) → −(φ, λ, ξ). Only in the strictly passive case, when (λ, ξ) = (0, 0), one finds that Eeff ≈ V0 is indeed symmetric.

We find, therefore, two main results. First, a stabilization mechanism of micro-phase separation, which is different from the one found in many equilibrium theories31. In the latter, the minimum of K0 at a non-zero wave-vector leads through the quadratic term in the energy to micro-phase separation. In our case, instead, the form of K0 is standard (it has a minimum at zero wave-vector), and it is the long-range coarse-grained potentials (not containing any quadratic contribution by construction) that play the crucial role. Second, once TRS is broken, even though forces are local, the interactions in the effective energy describing the non-equilibrium steady state become long-range for strong activity. Arguably, this is a key ingredient that allows the system to display phenomena that would be otherwise unexpected in equilibrium.

We have checked the quality of the model estimated by WCRG, and its correctness, by testing that in the equilibrium passive case, the energy function is directly linked to the forces in the dynamical equations. In the active case, we have sampled scale by scale using the estimated conditional probabilities (running a Metropolis Adjusted Langevin diffusion32), and checked the quality of the generated images. Figure 2 shows side by side a true example of AMB+ extracted from the training dataset (on the left of the right panel) and a sample generated from the learned model for an increasing level of activity. Clearly, in every case, the generated samples are qualitatively very close to the dataset used for training. This “visual correctness” is actually a difficult test to pass, but it is also important to show quantitative comparisons. In the Supplementary Information Fig. S5 we present more examples of samples from the model and also compare the Fourier spectrum and the probability distribution of the training and sampled datasets. All these show very good agreement in every situation tested.
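For reference, the Metropolis Adjusted Langevin algorithm mentioned above can be sketched as follows for a generic energy. This is a single-level toy version: it does not implement the scale-by-scale conditional sampling used in this work, and the function name `mala`, its arguments and the step size are assumptions of the example.

```python
import numpy as np

def mala(energy, grad_energy, x0, step, n_steps, rng):
    """Sample p(x) proportional to exp(-energy(x)) with Metropolis-adjusted Langevin.

    Each iteration proposes a Langevin move and accepts it with the
    Metropolis-Hastings probability, which removes the time-step bias.
    """
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_steps):
        g_x = grad_energy(x)
        prop = x - step * g_x + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
        g_prop = grad_energy(prop)
        # Log densities of the forward and backward Gaussian proposals
        log_q_fwd = -np.sum((prop - x + step * g_x) ** 2) / (4.0 * step)
        log_q_bwd = -np.sum((x - prop + step * g_prop) ** 2) / (4.0 * step)
        log_alpha = energy(x) - energy(prop) + log_q_bwd - log_q_fwd
        if np.log(rng.random()) < log_alpha:
            x = prop
        samples.append(x.copy())
    return np.array(samples)

# Usage: a two-dimensional unit Gaussian, E(x) = |x|^2 / 2.
rng = np.random.default_rng(0)
chain = mala(lambda x: 0.5 * np.sum(x**2), lambda x: x, np.zeros(2), 0.5, 2000, rng)
```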

The main result of the application of WCRG to AMB+ is that, in the strongly active regime of the AMB+ model, the effective energy inferred from system realizations develops long-range interactions that stabilize micro-phase separation. While the quadratic kernel K0 remains essentially identical to the passive case (a discrete Laplacian), the multiscale local potentials Vj reveal a qualitative change: in equilibrium, only V0 is non-zero, but under strong activity, higher-order terms up to V3 become significant. These long-range, non-quadratic interactions counteract the tendency for macroscopic phase separation and instead promote the formation of stable, finite-sized bubbles. This mechanism differs from equilibrium theories, where micro-phase separation typically arises from non-local quadratic terms. The effective energy thus captures genuinely non-equilibrium behavior. Time-reversal symmetry breaking induces emergent long-range structure, despite the underlying dynamics being local. The validity of the model learned by WCRG is supported by both visual and statistical agreement between real and generated configurations.

Out of equilibrium probes, length-scales, and patterns

We now present the second main result of this work: long-range interactions are directly associated with the out-of-equilibrium nature of the steady state. To this end, we study two ways to probe non-equilibrium behavior in detail. The first is the local entropy production rate. The knowledge of the effective energy enabled by the WCRG approach allows for the direct computation of this quantity. The second is the violation of the FDT as a function of the wave number.

Obtaining the patterns of entropy production33,34 is, in general, a difficult task because it requires knowing the gradient of the effective energy, a quantity which is unknown and difficult to obtain for a non-equilibrium steady state. WCRG, by providing an estimation of Eeff[φ], offers a very direct way to probe the local entropy production field. In fact, once the score \({\nabla }_{\varphi }\log p[\varphi ]=-{\nabla }_{\varphi }{E}_{{{{\rm{eff}}}}}[\varphi ]\) is known, using the force field F = − xJ and the mobility matrix \(\underline{\underline{M}}\) (coming from the discretization of the spatial differential operators), the local entropy production can be expressed as33,35

$$\sigma ({{{\boldsymbol{x}}}},t)=\frac{1}{D}\underline{\underline{M}}(F+D\nabla {E}_{{{{\rm{eff}}}}})\cdot (F+D\nabla {E}_{{{{\rm{eff}}}}})\,.$$
(4)

More details on the derivation of this expression and important remarks on the computation of the different quantities can be found in “Methods”. In Fig. 5, we show σ(x, t) (in red) superimposed on the field configuration (in gray scale) for realizations associated with four different state points. The first panel corresponds to the equilibrium case, the second one to an active system in a regime in which no long-range interactions are present in the effective energy. In both cases, the entropy production is absent or minimal. Note that in the case (λ, ξ) = (−0.5, 1), even though the microscopic dynamics contains terms breaking micro-reversibility, the system effectively behaves as if it were in equilibrium. The third and fourth panels correspond to cases where activity leads to long-range interactions. In these cases, entropy is instead produced locally at the bubbles generating the micro-phase separation, thus on the scale of the effective long-range interactions. It is mostly localized on the boundaries between the bubbles and the bulk, as found by several previous studies36,37. Note that because the temperatures we are investigating are not as low as in other studies, the localization at the boundary is less sharp. The typical scales on which the entropy is produced are similar between the two most active cases. For the (λ, ξ) = (1, 4) case, the amount of entropy production is greater, and the typical length-scale is slightly larger. This is consistent with the fact that the long-range potential V3 is greater in the most active case.
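The structure of Eq. (4) can be checked on a finite-dimensional toy problem, which is an assumption of this example and not the field-theoretic computation performed here. Consider a two-dimensional overdamped Langevin dynamics with unit mobility and force F = −x + ω(−y, x): the rotational part is not conservative, the steady state remains Gaussian with Eeff = |x|²/(2D), and Eq. (4) gives σ(x) = ω²|x|²/D, whose steady-state average is 2ω². The function name `local_entropy_production` is hypothetical.

```python
import numpy as np

def local_entropy_production(force, grad_e_eff, mobility, D):
    """sigma = (1/D) (F + D grad E_eff) . M (F + D grad E_eff), one value per sample.

    force and grad_e_eff have shape (n_samples, dim); mobility has shape (dim, dim).
    """
    v = force + D * grad_e_eff
    return np.einsum("ni,ij,nj->n", v, mobility, v) / D

# Toy check: steady-state samples of the rotating Ornstein-Uhlenbeck process.
rng = np.random.default_rng(0)
D, omega = 0.45, 0.5
x = np.sqrt(D) * rng.standard_normal((100_000, 2))
rotation = np.stack([-x[:, 1], x[:, 0]], axis=1)
force = -x + omega * rotation
sigma = local_entropy_production(force, x / D, np.eye(2), D)
mean_sigma = sigma.mean()   # expected to be close to 2 * omega**2
```

Setting ω = 0 makes the force conservative, and the same function then returns zero for every sample, as expected at equilibrium.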

Fig. 5: Entropy production rate.

Examples of the entropy production rate for different sets of model parameters (λ, ξ). Their value is displayed for each panel and corresponds to different macroscopic behaviors: a equilibrium (λ, ξ) = (0, 0), b below phase transition (λ, ξ) = (−0.5, 1), and c, d above phase transition (λ, ξ) = (0.75, 3.5) and (λ, ξ) = (1, 4). The gray-scale images encode the field value at each point. The superimposed color maps show the local entropy production rate, red for positive values, blue for negative ones, and transparent close to zero. As expected, the entropy rate is nearly everywhere positive or vanishing. The rate is significant only for a high degree of activity and stays minimal in the effectively passive regime. The entropy is produced mostly at the boundaries of the bubbles. This rate is obtained by computing the instantaneous production rate σ and averaging it over a short time, short enough not to modify significantly the spatial structures.

It is interesting to compare the method we propose to compute the entropy production rate with other approaches that have been developed in the literature. In ref. 36, the authors start from a path integral representation and obtain from it an exploitable form of the entropy production rate. This approach is very direct and amenable to analytic studies, but it requires a case-by-case field theory analysis. In contrast, the approach we used is more direct, in the sense that it only requires the knowledge of the dynamical equations of the system in order to connect the score function, i.e., \({\nabla }_{\varphi }\log p[\varphi ]\) (directly obtained from WCRG), to the entropy production. Our method is therefore better suited to experimental applications. Indeed, one future direction we would like to study is to apply the method we developed to data from experimental systems. The method developed in ref. 37 defines a local entropy production by dividing the system into small blocks, small enough to contain a single particle. Then, using results from information theory, it constructs an estimator of the entropy production based on the cross-parsing complexity. This method requires time trajectories of the system, which is not the case with our procedure, which only relies on independent snapshots of the system. An advantage of the approach of ref. 37 is that it is directly applicable to experimental data. Finally, ref. 35 used deep neural networks to estimate the score and then used an equation similar to (4) to estimate local entropy production. This work is more similar to ours methodologically. The main difference is that we use an interpretable form of the score based on WCRG, whereas they use a more expressive but not interpretable representation based on neural networks.

Analyzing the violation of the FDT has been a protocol used in several physical situations to characterize out-of-equilibrium systems. In fact, FDT is directly related to the time-reversal symmetry of the dynamics. The way in which FDT is violated informs on the out-of-equilibrium nature of the system, as originally proven for spin-glasses38, and recently for active systems39. One form of FDT is2,40

$$\chi (q,t)=\frac{1}{D}\left[C(q,0)-C(q,t)\right]\,,$$
(5)

where C(q, t) and χ(q, t) are respectively the autocorrelation function and the integrated response function of the density field, at wave-vector q and lag time t; D is the equilibrium temperature. The autocorrelation is computed by standard methods, using trajectories of the field obtained by numerical integration of the dynamics. To compute the response, we used a perturbation-free approach, based on the Malliavin weight technique. The procedure is detailed in Methods. Following refs. 38,39, we study in Fig. 6 the parametric plot of χ(q, t) as a function of C(q, t) for various values of q in the case of strong activity. In equilibrium one should find a straight line with slope −1/D.
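Both sides of Eq. (5) can be illustrated on the simplest equilibrium example, a one-dimensional Ornstein-Uhlenbeck process, which is an assumption of this sketch and not the field-level computation described in Methods. The integrated response is estimated without applying any perturbation by accumulating a Malliavin weight along each unperturbed trajectory: for an Euler step with noise increment √(2D dt) ξ, the derivative of the log transition probability with respect to a constant force is ξ √(dt/(2D)). The function name `ou_fdt_check` and all parameter values are hypothetical.

```python
import numpy as np

def ou_fdt_check(k=1.0, D=0.5, dt=0.01, n_steps=100, n_traj=20000, seed=0):
    """Return (chi, (C(0) - C(t)) / D) at t = n_steps * dt for dx = -k x dt + noise."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(0.0, np.sqrt(D / k), size=n_traj)   # stationary initial condition
    x = x0.copy()
    q = np.zeros(n_traj)                                # Malliavin weight
    for _ in range(n_steps):
        xi = rng.standard_normal(n_traj)
        x = x - k * x * dt + np.sqrt(2.0 * D * dt) * xi
        q = q + xi * np.sqrt(dt / (2.0 * D))
    chi = np.mean(x * q)          # integrated response to a constant force
    c0 = np.mean(x0 * x0)         # C(0)
    ct = np.mean(x0 * x)          # C(t)
    return chi, (c0 - ct) / D

chi, rhs = ou_fdt_check()
```

For this process both quantities should approach (1 − e^{−kt})/k up to statistical and time-discretization errors, so the pair (C, χ) falls on the equilibrium line of slope −1/D.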

Fig. 6: Parametric representation of the Fluctuation-Dissipation theorem at different wave numbers q for activity parameters (λ, ξ) = (1, 4).

The integrated response function χ(q, t) at wave number q is plotted as a function of the autocorrelation function C(q, t) at wave number q. The typical length scale corresponding to wave number q is denoted \(\ell =2\pi /q\). The dashed line corresponds to the equilibrium FDT, where D is the noise amplitude. The deviation is the strongest at intermediate q values, which correspond to the size at which the effective potentials display long-range interactions and to the size of the bubbles in the micro-separated phase.

This is actually close to what we find for high and low spatial wave-vectors, i.e., small and large length-scales. Instead, when considering length-scales and state points at which the effective potentials display long-range interactions, we find that FDT is strongly violated. Concomitantly, the behavior of C(q, t) is also peculiar, displaying a very slow relaxation for those modes (see Fig. 7 in “Methods”). For state points corresponding to weak activity (effectively passive regime of Fig. 2), we do not find any substantial violation of FDT at any wave-vector, as illustrated in the Supplementary Information Fig. S1.

Fig. 7: Autocorrelation function.

Spatial Fourier transform of the autocorrelation function C(q, t) of the AMB+ as a function of the time lag, for several wave numbers and for activity parameters (λ, ξ) = (1, 4). A logarithmic scale is used for time. Modes at small and large length scales show a familiar exponential decay. However, intermediate scales show a logarithmic relaxation.

In summary, local entropy production patterns and FDT violation both highlight the connection between the effective out-of-equilibrium behavior and the emergence of long-range interactions in Eeff. For state points in which the system displays standard macroscopic phase separation, both out-of-equilibrium markers are absent or minimal, and concomitantly, the long-range interactions are not present. In this regime, the system is in a steady state, which appears to be effectively in equilibrium despite the presence of terms breaking micro-reversibility in the equation of motion. A similar behavior is discussed in the case of active mixtures in ref. 41. Instead, in the part of the phase diagram in which the system displays micro-phase separation, entropy is produced and FDT is violated. Both probes indicate that the scale over which out-of-equilibrium effects are maximal corresponds to the long range over which interactions identified in Eeff act, and to the size of bubbles characterizing the micro-separated phase.

We conclude this section by clarifying the different types of input required for the analyses discussed above. Inferring the effective energy using the WCRG method requires only a dataset of representative steady-state snapshots of the system. In our study, these snapshots were generated by integrating the dynamics of AMB+, but access to the underlying dynamics is not essential—such snapshots could just as well come from experimental observations of a physical system. Computing the entropy production requires prior knowledge of how the score function (i.e., \({\nabla }_{\varphi }\log p\)) relates to entropy production, which depends on the system’s dynamical equations. Once this relationship is known (as in our case via Eq. (4)), the score—and thus the entropy production—can be directly extracted from the effective energy, again using only snapshot data. On the other hand, studying violations of the FDT through autocorrelation and response functions requires access to time-series data, i.e., sufficiently long trajectories of the system’s evolution.

Discussion

The interpretable model of the effective energy obtained by the WCRG allows us to connect the emergence of medium-long-range interactions to characteristic out-of-equilibrium behaviors of a general model of scalar active matter, the AMB+. Moreover, the form of the local potentials across scales offers a new perspective on the micro-phase separation out of equilibrium shown by active systems. Another application of WCRG, which was not studied in this work, but that would be worth exploring, is a data-driven renormalization group theory for non-equilibrium steady states. In fact, by following the flow of Eeff across scales, one can characterize critical points and the associated relevant operators. It was shown in ref. 9 that for the standard equilibrium \(\varphi^4\) field theory, WCRG is indeed able to describe the RG critical fixed point.

The key ingredient of WCRG is the ansatz for the form of the scale-dependent effective energy. The analysis of more complex active systems and of multiscale out-of-equilibrium states, such as turbulence, will certainly require more general ansatzes. The proposal in ref. 11 could provide a robust low-dimensional description sufficient to describe two-dimensional turbulence. Modern machine learning generative methods applied to non-equilibrium physical systems (see, e.g., ref. 42) are very powerful but lack interpretability, which is a key challenge to gain physical insights. The method presented in this paper is a step towards addressing this problem.

Methods

Wavelet bases

Orthogonal wavelet filters

We use an orthogonal decomposition that separates high from low frequencies, using two conjugate low and high-pass discrete filters \((g,\overline{g})\). We define the orthogonal operator \((G,\overline{G})\), with a convolution and a subsampling

$$G\varphi (n)=\varphi * g(2n)\,\,\,{{\rm{and}}}\,\,\,\overline{G}\varphi (n)=\varphi * \overline{g}(2n)\,.$$

In dimension d, conjugate mirror filters are computed as separable products of one-dimensional conjugate mirror filters \((g,\overline{g})\)43. For a signal φ of size \(L^{d}\), G computes a low frequency map of size \((L/2)^{d}\) whereas \(\overline{G}\) computes \(2^{d}-1\) high frequency maps of size \((L/2)^{d}\). For images, d = 2, \(\overline{G}\) computes vertical, horizontal, and diagonal details.
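As a minimal sketch of one step of these operators for d = 2, assuming the Haar filter pair (chosen here only for brevity; the function name `haar_step_2d` is ours and is not part of the WCRG code), the separable construction can be written with NumPy as follows.

```python
import numpy as np

def haar_step_2d(phi):
    """One orthogonal Haar step on an (L, L) array with even L.

    Returns the low-frequency map and the 2**d - 1 = 3 detail maps,
    each of shape (L/2, L/2). Haar filters are an illustrative assumption.
    """
    s = 1.0 / np.sqrt(2.0)
    # Filter and subsample along axis 0.
    lo0 = s * (phi[0::2, :] + phi[1::2, :])
    hi0 = s * (phi[0::2, :] - phi[1::2, :])
    # Filter and subsample along axis 1: separable products of 1D filters.
    low = s * (lo0[:, 0::2] + lo0[:, 1::2])      # low-pass on both axes
    det_a = s * (lo0[:, 0::2] - lo0[:, 1::2])    # low-pass axis 0, high-pass axis 1
    det_b = s * (hi0[:, 0::2] + hi0[:, 1::2])    # high-pass axis 0, low-pass axis 1
    det_c = s * (hi0[:, 0::2] - hi0[:, 1::2])    # high-pass on both axes
    return low, (det_a, det_b, det_c)

rng = np.random.default_rng(0)
phi = rng.normal(size=(8, 8))
low, details = haar_step_2d(phi)
```

Because the transform is orthogonal, the sum of squares of `low` and the three detail maps equals the sum of squares of `phi`.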

Fast Wavelet transform

We decompose the field φ0 into its wavelet decomposition \({\left({\overline{\varphi }}_{j},{\varphi }_{J}\right)}_{1\le j\le J}\) by iteratively applying orthogonal operators \((G,\overline{G})\). The field \({\varphi }_{j-1}\), with length scale \(2^{j-1}\), is decomposed into a coarser approximation \({\varphi }_{j}\) and \(2^{d}-1\) wavelet coefficient fields \({\overline{\varphi }}_{j}\) such that

$${\varphi }_{j}=G{\varphi }_{j-1}\,\,{{{\rm{and}}}}\,\,{\overline{\varphi }}_{j}=\overline{G}{\varphi }_{j-1}\,.$$

The finer scale field φj−1 can be recovered from \(({\varphi }_{j},{\overline{\varphi }}_{j})\) with the transposed operators \(({G}^{{{{\rm{T}}}}},{\overline{G}}^{{{{\rm{T}}}}})\) with

$${\varphi }_{j-1}={G}^{{{{\rm{T}}}}}{\varphi }_{j}+{\overline{G}}^{{{{\rm{T}}}}}{\overline{\varphi }}_{j}\,.$$
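The decomposition and reconstruction formulas above can be illustrated in one dimension. The following is a sketch under the assumption of Haar filters and an input of length divisible by 2^J; the helper names (`G`, `G_bar`, `reconstruct`, `fwt`, `ifwt`) are ours.

```python
import numpy as np

S = 1.0 / np.sqrt(2.0)

def G(phi):
    """Low-pass Haar filter followed by subsampling by 2."""
    return S * (phi[0::2] + phi[1::2])

def G_bar(phi):
    """High-pass Haar filter followed by subsampling by 2."""
    return S * (phi[0::2] - phi[1::2])

def reconstruct(phi_j, phi_bar_j):
    """phi_{j-1} = G^T phi_j + G_bar^T phi_bar_j for the Haar operators."""
    out = np.empty(2 * phi_j.size)
    out[0::2] = S * (phi_j + phi_bar_j)
    out[1::2] = S * (phi_j - phi_bar_j)
    return out

def fwt(phi0, J):
    """Return the wavelet fields (phi_bar_1, ..., phi_bar_J) and phi_J."""
    details, phi = [], phi0
    for _ in range(J):
        details.append(G_bar(phi))
        phi = G(phi)
    return details, phi

def ifwt(details, phi_J):
    """Invert fwt by going from the coarsest scale back to the finest."""
    phi = phi_J
    for phi_bar in reversed(details):
        phi = reconstruct(phi, phi_bar)
    return phi

rng = np.random.default_rng(0)
phi0 = rng.normal(size=16)
details, phi_J = fwt(phi0, J=3)
phi0_rec = ifwt(details, phi_J)
```

With these definitions, `phi0_rec` agrees with `phi0` up to floating-point error, which is the perfect-reconstruction property of the orthogonal pair.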

Asymptotic wavelet bases

When j goes to ∞, for appropriate high-pass filters \(\overline{g}\) and low-pass filters g, one can prove44 that the iterated wavelet filters \({G}_{j}={G}^{j}\) and \({\overline{G}}_{j}=\overline{G}{G}^{j-1}\), such that \({\varphi }_{j}={G}_{j}{\varphi }_{0}\) and \({\overline{\varphi }}_{j}={\overline{G}}_{j}{\varphi }_{0}\), converge to a scaling function φ(x) and wavelets \({\psi }_{k}(x)\), up to a dilation by \(2^{j}\). These limit functions are square integrable. One can prove43 that \({\{{2}^{-j/2}{\psi }_{k}({2}^{-j}x-n)\}}_{n\in {{\mathbb{Z}}}^{d},j\in {\mathbb{Z}},1\le k\le d}\) is an orthonormal basis of \({{{{\mathcal{L}}}}}^{2}({{\mathbb{R}}}^{d})\). Wavelet fields \({\overline{\varphi }}_{j}\) can be rewritten as decomposition coefficients in this wavelet orthonormal basis.

Choice of wavelet basis

The specific choice of wavelet ψ made for the WCRG is important to optimize the performance of the method. Indeed, with the development of wavelet theory, different kinds of bases have been created to represent functions43. One of the most important families of wavelets was invented by Daubechies44 and bears her name. They have compact support and lead to an orthogonal representation of \({L}^{2}({\mathbb{R}})\) functions. One important feature is that they can be picked to have p vanishing moments, for any \(p\in {\mathbb{N}}\), which means the wavelet is orthogonal to any polynomial of degree up to p − 1. There is a trade-off between the size of the support and the number of vanishing moments. Simplifying a bit, when trying to represent a field which is quite smooth, it is useful to increase the number of vanishing moments, because it will lead to a sparser wavelet representation43. In the case of AMB+ presented in subsection “Active Model B+”, inspection of the dynamical Eq. (3) shows gradient terms of the field up to order 4. These high-order gradient terms put strong constraints on the smoothness of the field and require the use of a wavelet with enough vanishing moments. We tested running WCRG using both Daubechies 4 symmlets (a more symmetric version of standard Daubechies) and Daubechies 1 (also known as Haar) wavelets, the order indicating the number of vanishing moments. Using Daubechies 4 gave much better inference performance for the model in the case of AMB+. But the situation was reversed when looking at the simpler \(\varphi^4\) model, which has lower-order gradient terms, so that the dynamics imposes less smoothness. Furthermore, using a wavelet with too many vanishing moments leads to conditional energies that are much more complicated to model, with non-trivial interactions across scales, and thus to approximation error11.
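The role of vanishing moments can be checked numerically on a toy signal. The sketch below compares the Haar filter (one vanishing moment) with the classical four-tap Daubechies filter, which has two vanishing moments; it is used here only because its coefficients are known in closed form, and it is not the four-moment symmlet used in this work. Function names are ours.

```python
import numpy as np

sqrt3 = np.sqrt(3.0)
# Low-pass filters: Haar (1 vanishing moment) and 4-tap Daubechies (2 vanishing moments).
h_haar = np.array([1.0, 1.0]) / np.sqrt(2.0)
h_d4 = np.array([1 + sqrt3, 3 + sqrt3, 3 - sqrt3, 1 - sqrt3]) / (4 * np.sqrt(2.0))

def high_pass(h):
    """Conjugate mirror high-pass filter g[k] = (-1)^k h[K-1-k]."""
    k = np.arange(h.size)
    return (-1.0) ** k * h[::-1]

def discrete_moment(g, m):
    """m-th discrete moment sum_k k^m g[k] of a filter."""
    k = np.arange(g.size)
    return np.sum(k ** m * g)

def detail_coefficients(signal, g):
    """Correlate with g and subsample by 2, keeping fully overlapping windows only."""
    return np.correlate(signal, g, mode="valid")[::2]

ramp = 0.5 * np.arange(16) + 2.0                  # polynomial of degree 1
d_haar = detail_coefficients(ramp, high_pass(h_haar))
d_d4 = detail_coefficients(ramp, high_pass(h_d4))
```

The detail coefficients `d_d4` of the linear ramp vanish up to rounding, whereas `d_haar` are constant and nonzero: a filter with p vanishing moments yields zero details for polynomials of degree up to p − 1, which is the source of sparsity for smooth fields.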

Score matching, free-energy estimation, and MALA algorithms for exponential families

We detail how we construct a parametric exponential model over p from parametric exponential models over the conditional distributions. We describe how to estimate, from data, the parameters and sample from this model. We justify the specific choice of the parametric model used in this paper.

Exponential families

We define an exponential parametric model \({p}_{{\theta }_{J}}({\varphi }_{J})\) of the distribution pJ(φJ) of the coarse-grained field φJ. With an ansatz ΦJ(φJ)

$${p}_{{\theta }_{J}}({\varphi }_{J})={{{{\mathcal{Z}}}}}_{J}^{-1}{e}^{-{\theta }_{J}^{{{{\rm{T}}}}}{\Phi }_{J}({\varphi }_{J})}\,.$$
(6)

\({{{{\mathcal{Z}}}}}_{J}^{-1}\) is a normalizing constant that ensures that \(\int\,{p}_{{\theta }_{J}}({\varphi }_{J})d{\varphi }_{J}=1\). The ansatz \({\Phi }_{J}={({\Phi }_{J}^{k})}_{k}\) is a set of real functions defined over \({\varphi }_{J}\).

For any j ≤ J, we define an exponential parametric model \({\overline{p}}_{{\overline{\theta }}_{j}}({\bar{\varphi }}_{j}| {\varphi }_{j})\) of the conditional probabilities \({\overline{p}}_{j}({\bar{\varphi }}_{j}| {\varphi }_{j})\), with ansatz \({\Psi }_{j}({\varphi }_{j-1})\)

$${\overline{p}}_{{\overline{\theta }}_{j}}({\overline{\varphi }}_{j}| {\varphi }_{j})={e}^{{F}_{j}({\varphi }_{j})-{\overline{\theta }}_{j}^{{{{\rm{T}}}}}{\Psi }_{j}({\varphi }_{j-1})}\,,$$

where Fj is a free energy that normalizes the conditional probability

$$\int\,{\overline{p}}_{{\overline{\theta }}_{j}}({\overline{\varphi }}_{j}| {\varphi }_{j})\,d{\overline{\varphi }}_{j}={e}^{{F}_{j}({\varphi }_{j})}\int\,{e}^{-{\overline{\theta }}_{j}^{{{{\rm{T}}}}}{\Psi }_{j}({\varphi }_{j-1})}\,d{\overline{\varphi }}_{j}=1\,.$$

Each free energy Fj is specified by \({\overline{\theta }}_{j}\), but, as detailed later, it does not need to be computed to estimate \({\overline{\theta }}_{j}\) or sample \({\overline{p}}_{{\overline{\theta }}_{j}}\). Fj is also approximated in a parametric family, with an ansatz Φj and parameters αj such that \({F}_{j}\approx {\alpha }_{j}^{{{{\rm{T}}}}}{\Phi }_{j}\).

Similarly to Eq. (1), we define a parametric model pθ(φ) of p(φ) as the product of the conditional parametric models defined above

$${p}_{\theta }(\varphi )=\,{p}_{{\theta }_{J}}({\varphi }_{J}){\prod }_{j=1}^{J}{\overline{p}}_{{\overline{\theta }}_{j}}({\overline{\varphi }}_{j}| {\varphi }_{j})\,,$$

The model \({p}_{\theta }={{{{\mathcal{Z}}}}}_{\theta }^{-1}{e}^{-{E}_{\theta }}\) has a Gibbs energy

$${E}_{\theta }={\theta }_{J}^{{{{\rm{T}}}}}{\Phi }_{J}+{\sum }_{j=1}^{J}\left({\overline{\theta }}_{j}^{{{{\rm{T}}}}}{\Psi }_{j}-{\alpha }_{j}^{{{{\rm{T}}}}}{\Phi }_{j}\right)\,.$$
(7)

Score matching

The parameters \({({\bar{\theta }}_{j},{\theta }_{J})}_{j}\) are regressed from data, such that the distributions \({({\bar{p}}_{{\bar{\theta }}_{j}},{p}_{{\theta }_{J}})}_{j}\) are the closest possible to \({({\bar{p}}_{j},{p}_{J})}_{j}\). Likelihood optimization minimizes the Kullback-Leibler divergence between the distributions. The optimal parameters are the Lagrange multipliers. This requires extensive numerical computations. Score matching provides a computationally scalable alternative to likelihood optimization of a model, which is valid for distributions with a bounded log-Sobolev constant, such as log-concave distributions. For conditional probabilities of scalar field theory, this hypothesis has been experimentally shown to hold9,10,11. Parameters \({\overline{\theta }}_{j}\) are estimated by minimizing a relative Fisher divergence

$$\ell ({\overline{\theta }}_{j})={{\mathbb{E}}}_{{p}_{j-1}}\left(\parallel {\nabla }_{{\overline{\varphi }}_{j}}\log {\overline{p}}_{j}({\overline{\varphi }}_{j}| {\varphi }_{j})-{\nabla }_{{\overline{\varphi }}_{j}}\log {\overline{p}}_{{\overline{\theta }}_{j}}({\overline{\varphi }}_{j}| {\varphi }_{j}){\parallel }^{2}\right).$$

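Following the derivation in ref. 45, for an exponential family this loss is equivalent, up to a constant factor and an additive term independent of \({\overline{\theta }}_{j}\), to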

$$\ell ({\overline{\theta }}_{j})={{\mathbb{E}}}_{{p}_{j-1}}\left(\frac{1}{2}\parallel {\overline{\theta }}_{j}^{{{{\rm{T}}}}}{\nabla }_{{\overline{\varphi }}_{j}}{\Psi }_{j}({\varphi }_{j-1}){\parallel }^{2}-{\overline{\theta }}_{j}^{{{{\rm{T}}}}}{\Delta }_{{\overline{\varphi }}_{j}}{\Psi }_{j}({\varphi }_{j-1})\right),$$

from which we can derive the closed form

$${\overline{\theta }}_{j} = \, {\overline{M}}_{j}^{-1}{{\mathbb{E}}}_{{p}_{j-1}}\left({\Delta }_{{\overline{\varphi }}_{j}}{\Psi }_{j}({\varphi }_{j-1})\right)\,,\\ \,\,\,{{{\rm{with}}}}\,\,\,{\overline{M}}_{j} = \, {{\mathbb{E}}}_{{p}_{j-1}}\left({\nabla }_{{\overline{\varphi }}_{j}}{\Psi }_{j}({\varphi }_{j-1}){\nabla }_{{\overline{\varphi }}_{j}}{\Psi }_{j}{({\varphi }_{j-1})}^{{{{\rm{T}}}}}\right)\,.$$

\({\overline{M}}_{j}\) is an ill-conditioned square matrix, which is regularized by adding ϵId. θJ, at the coarsest scale, is inferred similarly. Due to the finite size of the datasets, empirical averages replace the expectations.
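The closed-form estimator can be illustrated on a one-dimensional, unconditional toy problem. The sketch below assumes the features Ψ(x) = (x², x⁴) and standard Gaussian samples, for which the exact energy is x²/2; the feature choice, the regularization value, and the function name are illustrative assumptions, not the ansatz used in this work.

```python
import numpy as np

def score_matching_fit(x, eps=1e-8):
    """Closed-form score matching for the energy theta^T Psi(x), Psi(x) = (x^2, x^4).

    theta = (M + eps * Id)^{-1} E[Laplacian Psi], with M = E[grad Psi grad Psi^T],
    where expectations are replaced by empirical averages over the samples x.
    """
    grad_psi = np.stack([2 * x, 4 * x ** 3], axis=1)                # (N, 2)
    lap_psi = np.stack([2 * np.ones_like(x), 12 * x ** 2], axis=1)  # (N, 2)
    M = grad_psi.T @ grad_psi / x.size
    b = lap_psi.mean(axis=0)
    return np.linalg.solve(M + eps * np.eye(2), b)

rng = np.random.default_rng(0)
samples = rng.normal(size=200_000)   # standard Gaussian, energy x^2 / 2
theta = score_matching_fit(samples)
```

For this toy case the population solution is θ = (1/2, 0), and the estimate approaches it as the number of samples grows. No normalization constant is ever evaluated, which is the practical advantage over likelihood optimization.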

Free energy estimation

The parameters \({\overline{\theta }}_{j}\) estimated from score matching do not define probabilities \({\overline{p}}_{{\overline{\theta }}_{j}}\) that are normalized. Because we approximate \({\nabla }_{{\overline{\varphi }}_{j}}\log {\overline{p}}_{{\overline{\theta }}_{j}}\), our estimate is blind to any function of φj in \(\log {\overline{p}}_{{\overline{\theta }}_{j}}\). Indeed, orthogonality implies \({\nabla }_{{\overline{\varphi }}_{j}}{\varphi }_{j}=0\). Fj remains to be estimated.

By taking a derivative with respect to φj in Eq. (6), the free energy and its parameterized approximation are shown11 to minimize the quadratic loss function

$$\ell ({\alpha }_{j})={{\mathbb{E}}}_{{\overline{p}}_{{\overline{\theta }}_{j}}{p}_{j}}\left(\parallel {\alpha }_{j}^{{{{\rm{T}}}}}{\nabla }_{{\varphi }_{j}}{\Phi }_{j}({\varphi }_{j})-{\overline{\theta }}_{j}^{{{{\rm{T}}}}}{\nabla }_{{\varphi }_{j}}{\Psi }_{j}({\varphi }_{j-1}){\parallel }^{2}\right)\,.$$

We also derive a closed form for αj

$$\begin{array}{r}{\alpha }_{j}={\tilde{M}}_{j}^{-1}{{\mathbb{E}}}_{{\overline{p}}_{{\overline{\theta }}_{j}}{p}_{j}}\left({\nabla }_{{\varphi }_{j}}{\Phi }_{j}({\varphi }_{j}){\nabla }_{{\varphi }_{j}}{\Psi }_{j}{({\varphi }_{j-1})}^{{{{\rm{T}}}}}\right){\overline{\theta }}_{j}\,,\\ \,\,\,{{{\rm{with}}}}\,\,\,{\tilde{M}}_{j}={{\mathbb{E}}}_{{\overline{p}}_{{\overline{\theta }}_{j}}{p}_{j}}\left({\nabla }_{{\varphi }_{j}}{\Phi }_{j}({\varphi }_{j}){\nabla }_{{\varphi }_{j}}{\Phi }_{j}{({\varphi }_{j})}^{{{{\rm{T}}}}}\right)\,.\end{array}$$

We also regularize \({\tilde{M}}_{j}\).

Sampling with MALA

We produce a sample φ of pθ from coarse to fine

  • Initialization: compute a sample φJ of \({p}_{{\theta }_{J}}\).

  • For j from J to 1, given φj compute a sample \({\overline{\varphi }}_{j}\) of \({\overline{p}}_{{\overline{\theta }}_{j}}(\cdot | {\varphi }_{j})\) and set \({\varphi }_{j-1}={G}^{{{{\rm{T}}}}}{\varphi }_{j}+{\overline{G}}^{{{{\rm{T}}}}}{\overline{\varphi }}_{j}\).

The sample φ = φ0 of pθ is obtained by iteratively sampling random high frequencies conditionally on low frequencies. Both \({p}_{{\theta }_{J}}\) and \({\overline{p}}_{{\overline{\theta }}_{j}}(\cdot | {\varphi }_{j})\) are sampled using the Metropolis Adjusted Langevin Algorithm (MALA)32, which does not depend upon the normalization free energy Fj.
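To make the last point concrete, here is a minimal, generic MALA sketch for a density proportional to exp(−E(x)), run on independent chains in parallel. It assumes a user-supplied energy and gradient (a quadratic toy energy in the demo) and a fixed step size; it is not the sampler implementation used in this work. Only energy differences and gradients enter, so any additive term that does not depend on the sampled variable, such as Fj(φj) in the conditional models, drops out.

```python
import numpy as np

def mala(energy, grad_energy, x0, step, n_steps, rng):
    """Metropolis Adjusted Langevin Algorithm for p(x) proportional to exp(-energy(x)).

    x0 has shape (n_chains, dim); chains evolve independently.
    energy returns shape (n_chains,), grad_energy returns shape (n_chains, dim).
    """
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        gx = grad_energy(x)
        prop = x - step * gx + np.sqrt(2 * step) * rng.normal(size=x.shape)
        gp = grad_energy(prop)
        # Log densities of the forward and reverse Gaussian proposals (constants cancel).
        log_q_fwd = -np.sum((prop - x + step * gx) ** 2, axis=1) / (4 * step)
        log_q_rev = -np.sum((x - prop + step * gp) ** 2, axis=1) / (4 * step)
        log_alpha = energy(x) - energy(prop) + log_q_rev - log_q_fwd
        accept = np.log(rng.uniform(size=x.shape[0])) < log_alpha
        x[accept] = prop[accept]
    return x

rng = np.random.default_rng(0)
quadratic = lambda x: 0.5 * np.sum(x ** 2, axis=1)   # toy energy: standard Gaussian
chains = mala(quadratic, lambda x: x, np.zeros((2000, 1)), step=0.5, n_steps=200, rng=rng)
```

In the coarse-to-fine scheme, such a routine would be called once for the coarsest field and once per scale for the wavelet fields, with the conditioning field held fixed inside `energy` and `grad_energy`.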

Scalar potential ansatz

We detail how the ansatze \({({\Phi }_{j},{\Psi }_{j})}_{j}\) are defined. In particular, they define an energy Eθ which is made of multiscale scalar potentials.

At the coarsest scale, ΦJ includes a two-point interaction matrix KJ and a parametric scalar potential \({V}_{{\gamma }_{J}}\),

$${\theta }_{J}^{{{{\rm{T}}}}}{\Phi }_{J}({\varphi }_{J})=\frac{1}{2}{\varphi }_{J}^{{{{\rm{T}}}}}{K}_{J}{\varphi }_{J}+{V}_{{\gamma }_{J}}({\varphi }_{J})\,,$$

with θJ = (KJ, γJ). This ansatz is inspired by the \(\varphi^4\) energy.

The interaction Gibbs energy of \({\overline{p}}_{{\overline{\theta }}_{j}}({\overline{\varphi }}_{j}| {\varphi }_{j})\) includes two-point interactions within the high frequencies \({\overline{\varphi }}_{j}\) and between the high frequencies \({\overline{\varphi }}_{j}\) and the lower frequencies φj, with convolution matrices \({\overline{K}}_{j}\) and \({\overline{K}}_{j}^{{\prime} }\), plus a scalar potential

$${{\overline{\theta }}_{j}}^{{{{\rm{T}}}}}{\Psi }_{j}({\varphi }_{j-1})={\overline{\varphi }}_{j}^{{{{\rm{T}}}}}{\overline{K}}_{j}{\overline{\varphi }}_{j}+{\overline{\varphi }}_{j}^{{{{\rm{T}}}}}{\overline{K}}_{j}^{{\prime} }{\varphi }_{j}+{\overline{V}}_{{\overline{\gamma }}_{j}}({\varphi }_{j-1})\,.$$

with \({\overline{\theta }}_{j}=({\overline{K}}_{j},{\overline{K}}_{j}^{{\prime} },{\overline{\gamma }}_{j})\).

Finally, the free energy ansatz has the same form as at the coarsest scale,

$${{\alpha }_{j}}^{{{{\rm{T}}}}}{\Phi }_{j}({\varphi }_{j})={\varphi }_{j}^{{{{\rm{T}}}}}{\tilde{K}}_{j}{\varphi }_{j}+{V}_{{\tilde{\gamma }}_{j}}({\varphi }_{j})\,,$$

with \({\alpha }_{j}=({\tilde{K}}_{j},{\tilde{\gamma }}_{j})\). Under such parametrization, one can derive the scalar potential energy ansatz in Eq. (2) from Eq. (7).

$${E}_{{{{\rm{eff}}}}}(\varphi ) = \,{\varphi }_{0}^{{{{\rm{T}}}}}{K}_{0}{\varphi }_{0}+{V}_{{\overline{\gamma }}_{1}}({\varphi }_{0})\\ +{\sum }_{j=1}^{J-1}\left({V}_{{\overline{\gamma }}_{j+1}}({\varphi }_{j})-{V}_{{\tilde{\gamma }}_{j}}({\varphi }_{j})\right)+{V}_{{\gamma }_{J}}({\varphi }_{J})-{V}_{{\tilde{\gamma }}_{J}}({\varphi }_{J})\,.$$

where K0 is defined by recursively computing \({K}_{j-1}={G}^{{{{\rm{T}}}}}({K}_{j}-{\tilde{K}}_{j})G+{\overline{G}}^{{{{\rm{T}}}}}{\overline{K}}_{j}^{{\prime} }G+{\overline{G}}^{{{{\rm{T}}}}}{\overline{K}}_{j}\overline{G}\).

We leverage the invariance to translations of the system. Scalar potentials are summed over sites, \({V}_{\gamma }(\varphi )={\sum }_{i}{v}_{\gamma }(\varphi [i])\), with scalar potential \({v}_{\gamma }(t)={\sum }_{k}{\gamma }_{k}{\rho }_{k}(t)\) decomposed over a finite approximation family \({\{{\rho }_{k}(t)\}}_{k}\) with coefficients \(\gamma ={({\gamma }_{k})}_{k}\). We use translated sigmoids: \({\rho }_{k}(t)=1/(1+{e}^{(t-{t}_{k})/{\sigma }_{k}})\). In numerical applications, there are 25 evenly spaced translations tk, on the support of the distribution of each φj(n), and \({\sigma }_{k}=\frac{3}{2}({t}_{k+1}-{t}_{k})\). We parametrize the kernels to be local and only consider short-range interactions. This amounts to a parameter vector θ of approximate size 400.
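The translated-sigmoid family can be sketched as follows. The support bounds `t_min` and `t_max` are assumed to be supplied by the user (for instance from the empirical range of the field), and the function names are ours.

```python
import numpy as np

def sigmoid_features(values, t_min, t_max, n_basis=25):
    """Evaluate rho_k(t) = 1 / (1 + exp((t - t_k) / sigma_k)) for evenly spaced t_k.

    Returns an array of shape values.shape + (n_basis,) and the translations t_k.
    """
    t_k = np.linspace(t_min, t_max, n_basis)
    sigma = 1.5 * (t_k[1] - t_k[0])   # sigma_k = (3/2)(t_{k+1} - t_k), constant on an even grid
    z = (np.asarray(values)[..., None] - t_k) / sigma
    return 1.0 / (1.0 + np.exp(z)), t_k

def scalar_potential(phi, gamma, t_min, t_max):
    """V_gamma(phi) = sum_i sum_k gamma_k rho_k(phi[i]), summed over all sites i."""
    rho, _ = sigmoid_features(phi, t_min, t_max, n_basis=gamma.size)
    return np.sum(rho @ gamma)

grid = np.linspace(-2.0, 2.0, 25)
rho, t_k = sigmoid_features(grid, -2.0, 2.0)
```

Each basis function equals 1/2 at its own translation t_k and decreases from 1 to 0 across a window of a few grid spacings, so neighboring sigmoids overlap and their linear combinations can represent smooth potentials.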

Regression of linear part

Because the dynamical evolution we consider conserves the spatial mean of the field, there is a gauge ambiguity in the scalar potentials Vj: at each scale, they are defined up to a linear part ajφj + bj. To remove this ambiguity, at each scale we regress and then subtract any linear part in the potentials obtained after the training procedure. It is these potentials that are presented in the main text in Fig. 4.

The gauge ambiguity emerges from the fact that for a given sample of the field, the sum over all lattice sites is equal to the same quantity (due to the conservation of the mean imposed by the dynamics). This property is also true for the coarse-grained fields φj. This means that a linear term in the effective energy would lead to a field-independent term. Indeed, a linear term at scale j is of the form \({a}_{j}{\varphi }_{j}+{b}_{j}={a}_{j}{\sum }_{x}{\varphi }_{j}(x)+{b}_{j}={a}_{j}{m}_{j}+{b}_{j}\), where mj is set by the average value of the field at the scale j, which is a fixed quantity for the system we consider.
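A minimal sketch of this gauge fixing is given below. It assumes an unweighted least-squares fit on a uniform grid of field values; the weighting actually used for the regression is not specified here, and the toy potential and function name are illustrative.

```python
import numpy as np

def remove_linear_part(t, v):
    """Least-squares fit v(t) ~ a t + b on the grid t, then subtract the fit."""
    a, b = np.polyfit(t, v, deg=1)
    return v - (a * t + b), (a, b)

t = np.linspace(-1.5, 1.5, 101)
v = t ** 4 - t ** 2 + 0.3 * t + 2.0     # toy potential with a spurious linear part
v_fixed, (a, b) = remove_linear_part(t, v)
```

On this symmetric grid the fitted slope is 0.3, and refitting `v_fixed` returns a vanishing slope and intercept, so two potentials that differ only by a linear term are mapped to the same representative.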

Active B+ model simulation

To produce the different training datasets, the AMB+ was numerically simulated for different sets of parameters. The equations of motion (3) were discretized (with Δt = 0.01, Δx = Δy = 1.0), then integrated using a simple Euler scheme in time, where the white noise was discretized as a normally distributed random vector. Following ref. 27, the different spatial operators were represented as finite difference approximations of high enough order to properly capture the correct behavior. Periodic boundary conditions were used in both directions.
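The structure of such an integrator can be sketched on the passive Model B, used here purely as a stand-in: the active terms of Eq. (3) are not reproduced, low-order stencils replace the higher-order ones of ref. 27, and the parameter values `D`, `a`, `b`, `K` are hypothetical.

```python
import numpy as np

def laplacian(f, dx=1.0):
    """Five-point Laplacian with periodic boundary conditions."""
    return (np.roll(f, 1, 0) + np.roll(f, -1, 0)
            + np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f) / dx ** 2

def divergence(vx, vy, dx=1.0):
    """Backward-difference divergence with periodic boundary conditions."""
    return (vx - np.roll(vx, 1, 0) + vy - np.roll(vy, 1, 1)) / dx

def euler_step(phi, rng, dt=0.01, dx=1.0, D=0.1, a=-1.0, b=1.0, K=1.0):
    """One Euler step of passive Model B (illustrative stand-in, not AMB+)."""
    mu = a * phi + b * phi ** 3 - K * laplacian(phi, dx)   # chemical potential
    # White noise discretized as a normal random vector on each site (d = 2).
    amp = np.sqrt(2 * D * dt / dx ** 2)
    noise = divergence(rng.normal(size=phi.shape), rng.normal(size=phi.shape), dx)
    return phi + dt * laplacian(mu, dx) + amp * noise

rng = np.random.default_rng(1)
phi_init = 0.1 * rng.normal(size=(32, 32))
phi_t = phi_init.copy()
for _ in range(100):
    phi_t = euler_step(phi_t, rng)
```

Because both the deterministic term and the noise enter as discrete divergences on a periodic lattice, the spatial mean of the field is conserved up to rounding at every step.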

The datasets used for the inference of the WCRG model are constructed following a conservative procedure used for models with slow dynamics. We proceeded as follows: we started from a uniform initial condition, then let the system evolve to reach its steady state (the burn-in time was monitored by looking at different observables, such as the moments of the distribution, and waiting for them to reach a steady-state value). Then we sampled the system 50 times, once every 50000 time steps (with a time discretization \(\Delta t=10^{-2}\), i.e., every 500 time units). This timescale corresponds to the relaxation time of the slowest modes (defined as the time at which C(t) has decreased by half of its value at t = 0). Moreover, to improve the statistics, we ran 100 independent simulations in parallel, each sampled as described above. This procedure should decrease the variance due to the slowest mode and enable a correct statistical estimation.

Correlation of AMB+

We computed the autocorrelation of the AMB+ in the most active case considered in this study, (λ, ξ) = (1, 4), for several wave numbers and over a wide range of time-lags, to see how the different modes decorrelate. Figure 7 illustrates once again the importance of the intermediate length scale modes. Indeed, contrary to the high and low length scale modes, which show a standard exponential decay of their autocorrelation, these intermediate modes have rather a logarithmic decay. This kind of behavior can be a marker of a wide range of out-of-equilibrium effects. In this case, it is probably due to the strong long-range interactions acting on these modes.

The autocorrelation for the effectively passive sector, (λ, ξ) = (−0.5, 1), is available in the Supplementary Information, Fig. S2, and illustrates that in this case all modes decay exponentially, albeit with a large range of timescales.
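A possible estimator of the mode-resolved autocorrelation from a steady-state time series is sketched below. The normalization by the equal-time value and the synthetic test data (independent first-order autoregressive dynamics on each site) are our assumptions, chosen so that the expected decay is known.

```python
import numpy as np

def mode_autocorrelation(traj, lags):
    """Estimate C(q, lag) = <phi_q(t + lag) conj(phi_q(t))>_t / C(q, 0).

    traj has shape (T, L, L): a steady-state sequence of snapshots.
    Returns an array of shape (len(lags), L, L) indexed by the FFT wave vectors.
    """
    phi_q = np.fft.fft2(traj, axes=(1, 2))
    c0 = np.mean(np.abs(phi_q) ** 2, axis=0)
    out = []
    for lag in lags:
        prod = phi_q[lag:] * np.conj(phi_q[: phi_q.shape[0] - lag])
        out.append(np.mean(prod, axis=0).real / c0)
    return np.array(out)

# Synthetic data: every mode decorrelates as r**lag.
rng = np.random.default_rng(0)
r, T, L = 0.8, 4000, 8
traj = np.empty((T, L, L))
traj[0] = rng.normal(size=(L, L))
for step in range(1, T):
    traj[step] = r * traj[step - 1] + np.sqrt(1 - r ** 2) * rng.normal(size=(L, L))
C = mode_autocorrelation(traj, lags=[0, 1, 2])
```

On real trajectories one would plot `C` against the logarithm of the lag for selected wave numbers, which is how an exponential decay is distinguished from a slower, logarithmic one.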

FDT, Malliavin weight for computing the response

To show a violation of the FDT, it is necessary to access the autocorrelation and the linear response function of the system. The autocorrelation is straightforward and can be estimated efficiently in the steady state by slicing the trajectory into independent sequences and averaging over both these slices and several realizations of the system, which somewhat reduces the overhead of letting the system relax to the steady state from the initial condition. The response is more difficult because, naively, it requires explicitly perturbing the system with an external field and then estimating the derivative of the average value in the limit of a vanishingly small perturbation field. This is notoriously difficult. However, alternative methods exist that do not require explicitly perturbing the system. Instead, they introduce a supplementary variable, called the Malliavin weight (due to its link with Malliavin calculus), which evolves alongside the system and allows one to compute the response to a change of parameters46,47. This approach was already used to study active matter systems39. However, to exploit it in the context of the present paper, working with fields, a generalization of the method is necessary. Extended details about the use of these approaches in the context of field theories will be made available in a future technical note, but the main result is the following. To compute the response function of AMB+, an auxiliary field is tracked during the simulation of the model. This Malliavin field \({q}_{\alpha }(x,t)\) evolves according to a Langevin equation

$$\frac{\partial {q}_{\alpha }(x,t)}{\partial t}=\frac{1}{\sqrt{2D}}\int\,{{{\rm{d}}}}{x}^{{\prime} }\frac{\delta {H}_{\alpha }({x}^{{\prime} },t)}{\delta {H}_{\alpha }(x,t)}\nabla \cdot \Lambda ({x}^{{\prime} },t)\,,$$

where Hα is an external field added to the other forces in the dynamics for the field φ, and Λ is the same noise realization as the one used to simulate the φ dynamics. Using the same realization is crucial, because otherwise the Malliavin field would be uncorrelated to the field φ. Equipped with qα, computing the response function simply amounts to taking the average of it times the field

$$\chi (x,t)=\langle \varphi (x,t){q}_{\alpha }(x,t)\rangle \,.$$

It is straightforward to consider the Fourier transform of the Malliavin weight to compute the response as a function of the wave number.
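The principle can be checked on a zero-dimensional toy problem where the answer is known. The sketch below assumes an Ornstein-Uhlenbeck process dx/dt = −k x + h + sqrt(2D) ξ with a constant field h applied from t = 0, for which the response of ⟨x(t)⟩ to h at h = 0 is (1 − e^{−kt})/k; it is not the field-theoretic generalization used for AMB+, and all names and parameter values are illustrative.

```python
import numpy as np

def ou_response_malliavin(k=1.0, D=0.5, dt=0.01, n_steps=100, n_traj=100_000, seed=0):
    """Estimate d<x(t)>/dh at h = 0 without ever applying the field h.

    The weight q accumulates the same noise increments as x, scaled by 1/sqrt(2D),
    and the response is the average of the product x * q.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_traj)
    q = np.zeros(n_traj)
    for _ in range(n_steps):
        dW = np.sqrt(dt) * rng.normal(size=n_traj)   # shared noise realization
        x = x - k * x * dt + np.sqrt(2 * D) * dW     # unperturbed dynamics
        q = q + dW / np.sqrt(2 * D)                  # Malliavin weight
    return np.mean(x * q)

chi = ou_response_malliavin()
exact = 1.0 - np.exp(-1.0)   # (1 - exp(-k t)) / k with k = 1 and t = n_steps * dt = 1
```

The estimate `chi` agrees with `exact` up to statistical and time-discretization errors. Drawing independent noise for `q` instead of reusing `dW` would make the product average to zero, which illustrates why the same noise realization must be used.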

Entropy production

Computation from stochastic thermodynamics

To compute the entropy production rate, we follow classic results from the field of stochastic thermodynamics, presented for instance in ref. 33. We start from a general vector Langevin equation for the stochastic vector x

$$\dot{{{{\boldsymbol{x}}}}}=\underline{\underline{M}}{{{\boldsymbol{F}}}}+{{{\boldsymbol{\eta }}}}\,,$$

where F is the deterministic force, \(\underline{\underline{M}}\) is the mobility matrix, and η is a noise vector with correlation

$$\langle {{{\boldsymbol{\eta }}}}(t):{{{\boldsymbol{\eta }}}}({t}^{{\prime} })\rangle =2\underline{\underline{D}}\delta (t-{t}^{{\prime} })\,.$$

\(\underline{\underline{D}}\) is called the diffusion matrix. With these notations, the Fokker-Planck equation for the probability distribution p associated with this Langevin equation is

$${\partial }_{t}p=-\nabla {{{\boldsymbol{j}}}}=-\nabla (\underline{\underline{M}}{{{\boldsymbol{F}}}}p-\underline{\underline{D}}\nabla p)\,.$$

Generalizing slightly the result of ref. 33 to vectorial equations, the overall entropy production rate can be computed as

$$\dot{S} = \int\,{{{\rm{d}}}}{{{\boldsymbol{x}}}}{{{\boldsymbol{j}}}}{\underline{\underline{D}}}^{-1}\frac{{{{\boldsymbol{j}}}}}{p}\\ = \int\,{{{\rm{d}}}}{{{\boldsymbol{x}}}}(\underline{\underline{M}}{{{\boldsymbol{F}}}}p-\underline{\underline{D}}\nabla p){\underline{\underline{D}}}^{-1}\frac{(\underline{\underline{M}}{{{\boldsymbol{F}}}}p-\underline{\underline{D}}\nabla p)}{p}\,.$$

We consider a system in a non-equilibrium steady state, and we write the associated probability distribution using the effective energy \({p}_{{{{\rm{ss}}}}}({{{\boldsymbol{x}}}})={Z}^{-1}{e}^{-{E}_{{{{\rm{eff}}}}}({{{\boldsymbol{x}}}})}\), which means that \(\nabla {p}_{{{{\rm{ss}}}}}=-{p}_{{{{\rm{ss}}}}}\nabla {E}_{{{{\rm{eff}}}}}\), where \(-\nabla {E}_{{{{\rm{eff}}}}}=\nabla \log {p}_{{{{\rm{ss}}}}}\) is the score. The entropy production becomes

$$\begin{array}{rcl}\dot{S}&=&\int\,{{{\rm{d}}}}{{{\boldsymbol{x}}}}(\underline{\underline{M}}{{{\boldsymbol{F}}}}+\underline{\underline{D}}\nabla {E}_{{{{\rm{eff}}}}}){\underline{\underline{D}}}^{-1}(\underline{\underline{M}}{{{\boldsymbol{F}}}}+\underline{\underline{D}}\nabla {E}_{{{{\rm{eff}}}}}){p}_{{{{\rm{ss}}}}}\\ &=&{\langle (\underline{\underline{M}}{{{\boldsymbol{F}}}}+\underline{\underline{D}}\nabla {E}_{{{{\rm{eff}}}}}){\underline{\underline{D}}}^{-1}(\underline{\underline{M}}{{{\boldsymbol{F}}}}+\underline{\underline{D}}\nabla {E}_{{{{\rm{eff}}}}})\rangle }_{{p}_{{{{\rm{ss}}}}}}\,.\end{array}$$

We need to recover equilibrium when the activity coefficients are set to zero. This imposes the usual Einstein relation \(\underline{\underline{D}}=D\underline{\underline{M}}\), where D is the noise amplitude, introduced in the main text. This simplifies the expression for the entropy rate and means that it is not necessary to invert the diffusion matrix to compute it. We finally obtain the form discussed in the main text

$$\dot{S} = \frac{1}{D}{\langle \underline{\underline{M}}({{{\boldsymbol{F}}}}+D\nabla {E}_{{{{\rm{eff}}}}})({{{\boldsymbol{F}}}}+D\nabla {E}_{{{{\rm{eff}}}}})\rangle }_{{p}_{{{{\rm{ss}}}}}}\\ = \,{\langle \sigma ({{{\boldsymbol{x}}}},t)\rangle }_{{p}_{{{{\rm{ss}}}}}}\,,$$

where we introduced the local entropy production rate σ(x, t), which is the quantity computed in the main text.
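The final expression can be tested on a two-dimensional linear toy model whose steady state is known exactly. We assume a unit mobility matrix and the force F = −x + a R x, with R a rotation by 90 degrees: the rotational part is divergence-free and tangent to the level sets of |x|², so the steady state stays Gaussian with E_eff = |x|²/(2D), and the mean entropy production rate is 2a². The model and names are illustrative and unrelated to AMB+.

```python
import numpy as np

def local_entropy_production(F, grad_E, M, D):
    """sigma = (1/D) (F + D grad E)^T M (F + D grad E), evaluated per sample.

    F and grad_E have shape (N, n); M is an (n, n) mobility matrix.
    """
    v = F + D * grad_E
    return np.einsum("ni,ij,nj->n", v, M, v) / D

D, a = 0.4, 0.7
rng = np.random.default_rng(0)
x = np.sqrt(D) * rng.normal(size=(200_000, 2))    # samples of p_ss ~ exp(-|x|^2 / (2D))
R = np.array([[0.0, -1.0], [1.0, 0.0]])           # rotation by 90 degrees
grad_E = x / D                                    # gradient of E_eff = |x|^2 / (2D)
sigma_eq = local_entropy_production(-x, grad_E, np.eye(2), D)
sigma_neq = local_entropy_production(-x + a * x @ R.T, grad_E, np.eye(2), D)
```

With the conservative force alone, `sigma_eq` vanishes sample by sample, as expected at equilibrium. With the rotational force, the average of `sigma_neq` approaches 2a², and the only inputs are steady-state samples, the force, and the score.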

Discretization and mobility matrix

To adapt the results presented above to the study of field theories, one simply needs to explicitly discretize space. This is, anyway, what has to be done to carry out any numerical analysis. The values of the field at each discretization point are then arranged in a vector, which was denoted x above. For the study of AMB+, the mobility matrix \(\underline{\underline{M}}\) is taken to encode the finite difference representation of the gradient operators. It is a large matrix, but it is very sparse, so it can be efficiently stored, and matrix-vector products are also fast to compute.
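As an illustration of how cheap such products are, the sketch below applies a mobility operator without ever forming the matrix. It assumes M = −∇² with a five-point periodic stencil, as is natural for conserved dynamics; the exact stencil used in this work may differ, and the function names are ours.

```python
import numpy as np

def apply_mobility(v, dx=1.0):
    """Apply M = -Laplacian (five-point stencil, periodic) to a field v of shape (L, L)."""
    lap = (np.roll(v, 1, 0) + np.roll(v, -1, 0)
           + np.roll(v, 1, 1) + np.roll(v, -1, 1) - 4 * v) / dx ** 2
    return -lap

def mobility_quadratic_form(v, dx=1.0):
    """v^T M v, the quadratic form entering the entropy production rate."""
    return np.sum(v * apply_mobility(v, dx))

rng = np.random.default_rng(0)
v = rng.normal(size=(16, 16))
quad = mobility_quadratic_form(v)
```

Each product costs a number of operations proportional to the number of lattice sites. The operator is positive semi-definite, and its null space is spanned by spatially uniform fields.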

Numerical estimation

To plot the entropy production in Fig. 5, we performed a short-time average of successive time points to reduce the instantaneous noise in the entropy and have a more defined entropy production rate structure. These successive time points span an interval of Δt = 0.1. Comparing this time to the autocorrelation given in Fig. 7, we see that this time interval is small enough to have nearly no relaxation of any of the modes. This means that the system stays basically the same, apart from very high-frequency fluctuations coming from the stochastic noise, which is exactly what we wish to filter out.