Introduction

When is a system more than the sum of its parts? When and how do the properties of active components enable the emergence of a high-level, integrated decision-making entity1,2,3,4,5? These questions bear on issues in ecology, philosophy of mind, psychiatry, swarm robotics, and developmental biology6,7,8,9,10,11,12,13. In a sense, all intelligence is collective intelligence14,15 because even human minds supervene on a collection of cells which are themselves active agents. One practical way to define integrated emergent systems is by the fact that they have goals, memories, preferences, and problem-solving capabilities that their parts do not have. For example, while individual cells solve problems in metabolic, physiological, and transcriptional spaces, what makes an embryo more than a collection of cells is the alignment of cellular activity toward a specific outcome in anatomical morphospace16. Here, we focus on one aspect of emergent agency: integrated, distributed memory.

When a rat learns to press a lever to receive a reward, the cells of its paw touch the lever, while those in its gut receive the delicious food; no individual cell has both experiences. The “rat” is the owner of an associative memory that none of its parts can have. This ability to bind together the individual experiences of its parts is a hallmark of emergent agents. The rat can perform associative learning because it has the right causal architecture (implemented by the nervous system) to integrate information across space and time within its body. However, this ability is not unique to brainy animals: various kinds of problem-solving and learning occur in single cells (reviewed in refs. 17,18,19) because biology fundamentally exploits a multi-scale competency architecture in which the molecular components within a cell are likewise integrated to provide system-level, context-sensitive responses.

Regardless of the specific material implementation, certain functional topologies exhibit high emergent integration. In recent years, this topic has moved from philosophical debates over supervenience and downward causation to empirical science, as quantitative methods have been developed to measure the degree to which a system is more than its parts and possesses higher levels of organization that do causal work distinct from its lowest-level mechanisms20,21,22,23. This now enables the study of the relationship between minimal cognition and collective intelligence. A degree of integration among parts is required for any amount of cognitive function, such as learning. Here, we explore the inverse hypothesis: could the process of learning increase integration within a system? That is, could training a system reify and strengthen its existence as a unified, emergent, virtual governor24?

To study this question in the most minimal model system, in which all the components are well-defined, deterministic, and transparent, we chose Gene Regulatory Networks (GRNs). GRN models represent sets of gene products that up- or down-regulate each other’s activity based on a given functional connectivity map25. These networks are central topics in biomedicine26,27,28, evolutionary developmental biology29,30,31, and synthetic biology32,33,34,35. It is essential not only to predict their behaviors, but also to induce desired dynamics for interventions in regenerative medicine and bioengineering36,37,38,39,40,41. These networks are well-recognized to have emergent properties42,43,44,45,46,47,48, yet their control remains challenging49,50,51,52,53,54,55. We recently showed that such networks can be trained by providing stimuli on chosen nodes and reading out responses on other nodes: well-known paradigms from behavioral and cognitive science can be used to predictably change future responses as a function of experience, and biological networks show several different kinds of learning, including Pavlovian conditioning. Thus, we sought to measure emergent integration in these networks before, during, and after training, to determine what effect the induction of memory has on the network as a coherent agent.

To quantify this property, we took advantage of tools from neuroscience, where several measures of “integrated information” have been proposed to explain how the activity of individual neurons gives rise to a unified emergent mind56,57,58,59. Different flavors of integrated information have been applied to study information dynamics in, among others, natural evolution60, genetic information flow61, and biological and artificial neural systems62, and have even been suggested as indicators of “consciousness”63. Without seconding any of its claims about consciousness, we adopt the framework of Integrated Information Decomposition (\(\Phi {ID}\))64 to define a specific measure of causal emergence: it quantifies the extent to which the whole system provides information about its future evolution that cannot be inferred from any of its individual components or, in other words, the extent to which the system behaves as a collective whole65. Intuitively, the higher the causal emergence, the stronger the integration (or inseparability) of a collective of components. \(\Phi {ID}\) provides a rigorous framework to study causal emergence in a variety of different systems, from Conway’s Game of Life66 to human and primate neural dynamics67, including in comatose68 and brain-injured69 patients.

Here, we analyze in silico simulations showing how causal emergence increases in response to associative training in specific GRNs, characterize the GRNs’ integration behaviors, and uncover a relationship between this phenomenon and the underlying biology (phylogeny and gene ontology of specific networks).

Results

We adopted the recent framework of Integrated Information Decomposition (\(\Phi {ID}\))64, an established approach to quantifying causal emergence, and applied it to the dynamics of 29 GRNs (described as Ordinary Differential Equations, or ODEs) from the BioModels database70, before and after training the networks on an associative memory task. We assessed how learning strengthens causal emergence in basal biological systems like regulatory networks, and how this results in qualitatively different patterns of GRN behavior related to the biology (phylogeny and gene ontology) of the networks.

We illustrate how associative memory works in Fig. 1A, B. For each network, we pre-tested every triplet of its nodes (a circuit in which we assign nodes to the roles of unconditioned stimulus (UCS), neutral stimulus (NS), and response (R)) to determine whether it passed the test for associative memory70, similar to Pavlovian conditioning in animals. We did so by ensuring that stimulating the UCS alone triggers an increase in R and that stimulating the NS alone does not trigger R, and then: (1) “relaxing” the network in the initial phase; (2) stimulating both the unconditioned and neutral stimuli during the subsequent training phase; (3) verifying that, after the paired exposure, stimulation of the NS alone regulated R (see “Materials and Methods”). Out of all the circuits, 808 (among 19 networks) passed this pretest, and we considered these for all subsequent analyses.
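To make the pretest logic concrete, the following is a minimal Python sketch. It assumes the four relevant trajectories (relaxation baseline, UCS-alone, NS-alone, and NS-alone after paired training) have already been simulated; the function names, dummy data, and array shapes are illustrative, not the actual pipeline.

```python
import numpy as np

def regulated(test_trace, relax_trace):
    """R counts as up-regulated if its mean during testing is at least twice
    its relaxation mean, and down-regulated if at most half (see Methods)."""
    ratio = test_trace.mean() / relax_trace.mean()
    return ratio >= 2.0 or ratio <= 0.5

def passes_pretest(relax, ucs_only, ns_only, ns_after_pairing, r):
    """Pretest for one (UCS, NS, R) triplet, given four simulated trajectories
    of shape (T, n_species) and the index r of the response node."""
    if not regulated(ucs_only[:, r], relax[:, r]):
        return False                     # the UCS alone must regulate R...
    if regulated(ns_only[:, r], relax[:, r]):
        return False                     # ...while the NS alone must not...
    return regulated(ns_after_pairing[:, r], relax[:, r])  # ...until pairing

# Dummy trajectories standing in for GRN simulations (10 steps, 3 species):
rng = np.random.default_rng(0)
relax = rng.uniform(0.4, 0.6, (10, 3))
print(passes_pretest(relax, relax * 3, relax, relax * 3, r=2))  # True
```

In the full scan, this check is repeated over every ordered triplet of distinct nodes in each network.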

Fig. 1: Learning in gene-regulatory networks.

A Pavlovian conditioning and its application to gene regulatory networks. The standard paradigm is that the Conditioned Stimulus (CS) is initially neutral in that it does not elicit a Response (R), while an Unconditioned Stimulus (UCS) does. Some systems, like dogs, eventually show a response to the CS after it has been presented together with the UCS (forming an association between those stimuli). The same paradigm has been used with GRNs and pathways by mapping the CS, UCS, and R roles to specific nodes in the network and stimulating them by transiently raising their activity level (for example, by increasing their expression via a chemical ligand). Associative memory requires a degree of integration within the network, and here we explore the hypothesis that associative conditioning reifies the learning agent by increasing its emergent integration metrics (schematized by the dog progressively becoming more of a unified, centralized agent and less a collection of cells). Reproduced with permission from ref. 86. B Associative conditioning in simulated gene regulatory networks; simulation time is on the x-axis, while gene expression levels are on the y-axis. During the training phase, we pair UCS and CS stimuli to regulate a response R. If associative conditioning has taken place, we observe that the CS alone (i.e., with no UCS stimulation) regulates R. In the schematic, we illustrate stimulation/regulation as upticks of the expression levels over a baseline value; in reality, gene expression can have a quantitatively different shape, but the principle remains the same. Used with permission from ref. 87. C How causal emergence (quantified with Integrated Information Decomposition64) changes over simulated time in two sample gene regulatory networks. For the network on top, it increases due to associative training; for the network on the bottom, it does not. The diverse relationships between training and causal emergence are studied across the full set of networks in subsequent figures.

We then applied \(\Phi {ID}\) to compute causal emergence from the genes’ expression signals, using it as an exhaustive measure of all the ways a macroscopic (i.e., whole-network) feature could affect the future of any part of the network (including the network itself). Intuitively, this definition quantifies the degree to which the whole system influences the future in a way not discernible by considering the parts alone (schematized by the dog of Fig. 1 progressively becoming more of a unified, centralized agent and less a collection of cells). Causal emergence is a numerical quantity measured in natural units of information; the higher it is, the more “emergent” the system is, in the sense that the “macro” beats the “micro” in explaining the system dynamics71,72. In Fig. 1C, we report a key finding: associative training raises causal emergence in most of the GRNs studied.

Emergence increases after training

We set out to study how causal emergence changes from before to after training GRNs for associative memory, and specifically to test the conjecture that the experience of learning an associative task raises the causal integration of the system. Because we also wanted to know whether biological networks have any unique properties in this regard, we constructed a set of 145 random networks as controls using the established gene circuit method (see “Materials and Methods”)71, which randomizes the connection strengths of these pseudo-biological networks with different random seeds. We computed the average % change in causal emergence from before to after training with paired stimuli (Fig. 2A), where each point corresponds to one circuit of one network, for both biological and random networks. We also plotted in Fig. 2B the fraction of biological GRNs that had any associative memory, and we observed an increase in causal emergence from before to after training in most networks. Finally, we verified that the change in causal emergence was persistent by simulating the networks for longer without any stimulation; the trend persisted and did not differ from what was observed during the test phase.

Fig. 2: Biological networks with memory exhibit an increase of causal emergence during training and do so more than random networks.

A Error bars for the % change in causal emergence from before to after training for biological and random networks; each point corresponds to one circuit of one network. Biological networks are significantly more causally emergent after training. The asterisks above the brackets indicate significance at p < 0.001 under the Mann-Whitney test. B Bars for the % of networks that have memory, and the % of those with memory that show an increase in causal emergence after training. Associative training results in increased causal emergence in almost all networks with memory.

In total, causal emergence was strengthened in 17 out of 19 networks. The change amounted, on average, to a 128.32 ± 81.31% increase from before to after training. This result was significant (p < 0.001) under the Wilcoxon signed-rank test for paired data, indicating that the before- and after-training samples came from statistically different distributions and confirming that training for associative memory increased causal emergence in most of the networks studied. Random networks had an average change of 56.25 ± 51.40%, and the difference from biological networks was significant (p < 0.001). However, random networks had higher levels of absolute causal emergence before training. In other words, random networks started with higher emergence but did not increase it as much, while biological networks started with lower emergence but increased it with experience.
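The two statistical comparisons can be reproduced with standard SciPy routines; this is a minimal sketch on dummy per-circuit values (the arrays are invented stand-ins for the real measurements, with 808 biological memory circuits):

```python
import numpy as np
from scipy.stats import wilcoxon, mannwhitneyu

rng = np.random.default_rng(0)
# Dummy causal emergence per circuit, before and after training:
ce_before_bio = rng.uniform(0.1, 1.0, 808)
ce_after_bio = ce_before_bio * rng.uniform(1.0, 3.6, 808)
change_bio = 100 * (ce_after_bio - ce_before_bio) / ce_before_bio
change_rand = rng.normal(56, 51, 500).clip(min=-99)  # dummy control changes

# Paired test: did training shift emergence within biological circuits?
_, p_paired = wilcoxon(ce_before_bio, ce_after_bio)
# Unpaired test: do biological and random % changes differ?
_, p_groups = mannwhitneyu(change_bio, change_rand)
print(f"paired p = {p_paired:.2e}, groups p = {p_groups:.2e}")
```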

Previous network structure and function classifications do not capture emergence

We next sought to test whether our \(\Phi {ID}\) results captured new information about these networks that could not be derived from previously existing metrics. To this end, we characterized the GRNs along established structural and functional dimensions. For all the circuits with memory, we computed established properties from network theory (in-degree, out-degree, betweenness centrality, PageRank, and HITS scores) for structure, and from the dynamical systems literature (sample entropy, Lyapunov exponents, correlation dimension, detrended fluctuation analysis, and generalized Hurst exponent) for function. We found no significant correlations, or only weak ones, with the change in causal emergence (Fig. 3A, B). These findings show that our experimental protocol uncovers information that existing network classifications do not capture: how training affects the causal emergence of GRNs.
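The structural half of this characterization is standard network analysis; a minimal sketch with networkx follows, using a toy random digraph in place of a real GRN (the graph and the correlated vectors are illustrative only):

```python
import numpy as np
import networkx as nx
from scipy.stats import kendalltau

# Toy directed graph standing in for one GRN's connectivity (illustrative).
G = nx.gnp_random_graph(12, 0.3, seed=1, directed=True)

# Per-node structural properties, averaged into per-network summaries:
hubs, authorities = nx.hits(G)
structure = {
    "in_degree": np.mean([d for _, d in G.in_degree()]),
    "out_degree": np.mean([d for _, d in G.out_degree()]),
    "betweenness": np.mean(list(nx.betweenness_centrality(G).values())),
    "pagerank": np.mean(list(nx.pagerank(G).values())),
    "hits_hub": np.mean(list(hubs.values())),
}

# Correlating any such per-circuit property with the % change in causal
# emergence (dummy vectors here; one entry per circuit in the real analysis):
rng = np.random.default_rng(0)
tau, p = kendalltau(rng.normal(size=100), rng.normal(size=100))
```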

Fig. 3: Change in causal emergence is not correlated with established structure, function, and activity metrics for networks.

Kendall’s rank correlation coefficient between % change in causal emergence and network A structure properties, B function properties, and C activity (measured as the first derivative of the state variables). Our causal emergence metric captures an aspect that neither established network properties nor mere activity encompass: how gene regulatory networks react to training.

We next tested whether \(\Phi {ID}\) merely tracks network “activity” (i.e., we checked that periods of low integration did not simply correspond to quiescent periods of low signaling). We characterized overall network “activity” as the first derivative of the ODE state variables and found no correlation with the % change in causal emergence (Fig. 3C), revealing that causal emergence is not the same as network activity. If, for example, causal emergence spikes up, this is not an artifact of increased network activity; similarly, if causal emergence drops, this is not attributable to a drop in signaling among nodes. In other words, the dynamics of causal emergence are distinct from the levels of native or induced activity within the networks.

Automatic classification of emergence trajectories into behaviors finds five “species”

We observed several different ways in which the integration of a network changed due to the training phase (Fig. 4). We wondered whether the effects of training on integration across networks were highly diverse (forming a smooth space of possible effects), or whether there would be discrete categories of effects defining a kind of “species” with respect to how training affected GRNs’ causal emergence. We first described each test phase’s causal emergence trajectory using behavioral descriptors; alternatively, we could have extracted learned features through a neural network, but these would have been less interpretable72. We settled on seven descriptors that we found (after manual search) to be both the most expressive (in terms of quality of the resulting classification) and the most compact: trend, monotonicity, flatness, number of peaks, average distance among peaks, average difference among peaks, and range (see “Materials and Methods” for the detailed procedure).

Fig. 4: GRNs exhibit distinct patterns of causal emergence with respect to the effects of training.

Panels show five sample causal emergence trajectories (one gene regulatory network per row) of the five automatically discovered behaviors during the test phase. Each behavior represents trajectories sharing a specific and distinct temporal pattern.

We applied k-means73, an established unsupervised learning technique, to automatically classify the extracted behavioral descriptors from each test phase of the circuits that have memory. This technique revealed discrete clusters in terms of the behavior descriptors (how emergence changes due to training), in effect classifying networks into “species” of individuals that exhibited similar effects of training upon their causal emergence. We tuned the number of species to the optimum according to the Silhouette coefficient, a measure of clustering quality. We found five optimal “species” of behavior and nicknamed them homing, inflating, deflating, spiky, and steppy after their characteristics (see next section). We plotted the 2D t-SNE74 embedding of the descriptors of each test phase trajectory, colored by the assigned behavior, in Fig. 5. The behaviors corresponded to a very clear separation in the t-SNE embedding, with some overlap around the intersection of the five behaviors. The Silhouette coefficient of 0.5 indicates a good partitioning, i.e., one significantly different from random assignment (in which case the coefficient would be 0). We report the value counts per species in Table 1. This analysis showed that the species distribution was uneven, with a slight plurality of homing individuals and steppy being the rarest species among the GRNs that have memory.
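The clustering and embedding steps use standard scikit-learn components; a minimal sketch on dummy descriptor vectors (808 circuits by 7 descriptors, values invented for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(808, 7))  # 7 behavior descriptors per memory circuit

# Pick the number of "species" that maximizes the Silhouette coefficient:
scores = {}
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
best_k = max(scores, key=scores.get)

labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)
embedding = TSNE(n_components=2, random_state=0).fit_transform(X)  # Fig. 5-style plot
```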

Fig. 5: Effects of training on causal emergence occur in five distinct types.

t-SNE embedding in 2D of the behavior descriptors for the test trajectories, as classified by the k-means unsupervised learning algorithm; each point corresponds to the test phase for one circuit of one network. The behaviors segregate well into distinct clusters.

Table 1 Number of memory circuits per behavior class

Finally, we tested whether different circuits within the same network have different behaviors or consistently fall into one; in other words, whether the same network can host multiple “personalities” in the form of circuits that respond differently to training. We performed a chi-squared test for independence between the network identifier and the behavior label, which revealed (p < 0.001) that circuits belonging to the same network preferentially adopt one species of behavior.
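This kind of independence test is a one-liner with SciPy; a sketch with an invented contingency table (rows would be networks, columns behavior classes, entries the number of memory circuits):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Dummy counts: 3 networks x 5 behavior classes.
table = np.array([[30, 2, 1, 0, 3],
                  [4, 25, 0, 2, 1],
                  [1, 0, 18, 5, 0]])
chi2, p, dof, expected = chi2_contingency(table)
# A small p would indicate circuits within a network share a behavior.
```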

Visualization of the five behaviors reveals relevant patterns

We then set out to study how the five types of behaviors differed. We plotted five sample trajectories for each behavior in Fig. 4, where the x-axis corresponds to simulated time. Each behavior represented trajectories sharing a distinct pattern, as if the GRN agents were adopting a specific behavior in emergence space. Homing trajectories frequently oscillated around their mean, as if the GRN were “numb”. Inflating individuals had an overall positive trend, as the name suggests, as if training made them more and more emergent, whereas the opposite was true for deflating individuals. The spiky behavior consisted of a few periodic, extreme bursts of emergence that left the long-term trend unchanged. Finally, steppy individuals showed a few prolonged bursts of emergence (unlike the short bursts of spiky individuals), as if they were strides.

The descriptor histograms in Fig. 6 illustrate the effects of training on emergence. They show, for example, how homing individuals were non-monotonic and flat; spiky individuals were also flat but had, on average, more widely spaced peaks and a larger range; finally, deflating individuals were negatively monotonic, whereas inflating ones were positively monotonic in their causal emergence.

Fig. 6: Causal emergence types are characterized by different traits.

Histograms for the seven descriptors (features of causal emergence during testing; one per row) by the five automatically discovered causal emergence types (behaviors; one per column). Behaviors correspond to different descriptors’ distributions, meaning that our classification captures different manners in which causal emergence reacts to training.

Behaviors differ by phylogeny and gene ontology

Having seen that the integrated nature of biological networks displayed different responses to training, we wondered whether these classes corresponded to distinct biology (phylogeny and gene ontology of the network): might networks belonging to different types of processes or species exhibit different responses with respect to how much they are reified by associative conditioning? We extracted this information directly from the BioModels website and visualized the results in Fig. 7 with heatmaps. Figure 7A shows the relative occurrence (i.e., all the behaviors sum to 1) of the five automatically classified behaviors by phylogeny (top row) and gene ontology (bottom row). Some cells are marked “N/A” because not every behavior is represented in every phylogeny or gene ontology. The tables show that there is a relationship between phylogeny (or gene ontology) and behavior occurrence. For example, lower vertebrates (which mostly include Xenopus laevis) have the highest diversity, followed by insects and plants, whereas mammals show the least. Slime molds broke down similarly to mammals, though it was hard to draw conclusions from that comparison because only one network fell under this taxon (from Physarum polycephalum); we still included its results for completeness. When looking at gene ontology, the MAPK cascade and the mitotic cell cycle were the most diverse, and stem cell differentiation was the least. Similarly to slime molds, some gene ontologies (far-red light signaling in P. polycephalum and the sucrose biosynthetic process in Saccharum officinarum) were represented by only one network, but we included their results for completeness, even though few inferences could be made.

Fig. 7: Relationship between behavior class and the biological nature of each network.

A Two-way table of relative occurrences of behaviors for circuits (ways in which emergence changes upon training) for each taxon (top row) and gene ontology (bottom row); columns sum to one. B Two-way table of average % causal emergence change from before to after training for each taxon and gene ontology, including the margins (averages over rows/columns). Some cells are marked as “N/A” because not every taxon or gene ontology is represented for a given behavior. The occurrence of behavior is related to taxa and gene ontologies87.

A similar result emerged when analyzing the average (across circuits) % change in causal emergence from before to after training in Fig. 7B, which includes the margins. Plants showed the greatest increase in emergence, and insects, which were also among the most behaviorally diverse taxa (Fig. 7A), the greatest decrease. On the other axis, the sucrose biosynthetic process showed the greatest increase in causal emergence and the regulation of the circadian rhythm the greatest decrease. We performed chi-squared tests for independence on all the two-way tables of Fig. 7 and found the results to be significant, confirming our hypothesis that the specific ways in which causal emergence is potentiated by learning correlate with the specific phylogeny and gene ontology of the networks.

Discussion

How could a system’s supervenience over its parts be increased? Here we showed that re-engineering its hardware architecture (physical topology) is not required. Rather, using an in silico model of a minimal agent, we found that training a network for associative memory can increase its integrated causal power. We also demonstrated another surprising result: the relationship between causal integration and learning can be bi-directional. Not only is integration needed for a system to be capable of binding the experiences of its parts into associative learning at the collective level, but, conversely, associative conditioning can potentiate the collectivity and integration of networks.

We have shown that these results hold for a specific type of system, ODE GRNs, which is relevant to the mechanisms of physiological regulation and cognition in animals, at both the evolutionary and individual behavioral scales. Our work was motivated by two considerations. The first is aimed at the nascent field of diverse intelligence and unconventional cognition75, seeking to understand the dynamics necessary and sufficient for the emergence of integrated selves on a scale from the most minimal matter to human metacognition and beyond76. The second is a roadmap toward new ways to manipulate biological matter for biomedical and bioengineering applications, moving beyond rewiring biochemical details toward programming, communicating with, and motivating living tissues toward desired system-level outcomes6.

Network theory77 and dynamical systems theory78 provide metrics to analyze networks, including computational models of GRNs79. These existing tools allow the inference of biological pathways from data80, as well as algorithms for predicting how pathways will respond to new inputs81. While these tools are very useful32,35,82, advances are limited by the assumption that structure fully explains function83,84: a view of molecular pathways as mechanical machines inevitably focuses attention on the hardware and on approaches to modify it, such as CRISPR, protein engineering, and the editing of promoters to create novel connections between them. However, network rewiring is hard and time-consuming, posing many challenges for the synthetic biologist85 or the designer of gene therapies. Recent research has shown that GRNs can demonstrate a variety of unanticipated behaviors, such as associative memory86,87, and that such behavior arises from changes in the signaling within a specific network rather than changes in its wiring (which has significant implications for the development of biomedical applications that use patterns of stimuli and do not rely on gene therapy88).

More broadly, the field of diverse intelligence research17,18,89,90,91,92 seeks to understand cognition in unfamiliar guises and implementations. Beyond traditional studies of organisms with brains, it seeks to understand ways in which learning, decision-making, and different degrees of intelligent problem-solving can be implemented in a wide range of media. Especially important are ways in which evolutionary and engineering processes can scale up the basal cognitive capacity of minimal active matter76 and lead to the emergence of new agents that are in crucial ways “more than the sum of their parts”. Thus, in addition to the biomedical motivation for finding new ways to induce and manipulate memories in GRNs, we seek to use memory in gene regulatory networks as a model system in which to understand the origin of unconventional, minimal cognitive systems. By expanding concepts from neuroscience (such as measures of integration) to novel substrates93, we hope to understand the relationship between learning and the causal structures that implement active agents composed of parts (i.e., all of us). In doing so, we found a surprising result, in which training reifies the causal potency of a distributed system.

Training a biological network for associative memory increases causal emergence by 128.32% on average, i.e., by a factor of more than two. This does not happen for every network: some GRNs remain as they were natively, before training, while others (the vast majority) adapt based on what they experience. Borrowing an analogy from circuit components, some GRNs behave like resistors (which function the same regardless of their history or the frequency of the incoming signal), while others behave like memristors94 (which exhibit a high degree of hysteresis and are frequency-dependent). Interestingly, the specific way in which causal emergence rises after training is similar across the various circuits of a given network, suggesting that a network has a consistent “personality” with respect to how stimulation of its various input nodes affects its integrated nature.

This increased causal emergence contrasts dramatically with what we observed in random networks, which show an average increase of only 56.25%. This result suggests that evolution may have selected biological networks to be responsive to training in a way that is not reducible to generic network dynamics. In other words, unlike some generic network properties (attractors, stability, etc.) that are found even in random networks95, the ones we describe have been (directly or indirectly) reinforced by the processes of life96. There was no correlation between causal emergence and mere GRN activity, meaning that our findings cannot be derived from established network metrics. This result is consistent with studies on integrated information and psychedelics97,98, which have shown that measurements of “how much activity” exists in the brain do not correlate well with the richness of the corresponding conscious experience.

We automatically categorized causal emergence trajectories into behaviors and investigated how behavior depends on the biology of the network, in particular phylogeny and gene ontology. We found a dependence between the two, with different phylogenies and gene ontologies responding with different characteristic behaviors. However, we found no relationship between evolutionary recency and behavior. Mammals and slime molds are, respectively, the most recent and one of the most ancient taxa in our study, yet they share low levels of diversity and relatively high increases in causal emergence. Plants (an ancient taxon99) and insects (a less ancient taxon100) show, respectively, the greatest positive and negative changes in causal emergence. In the future, we will investigate what happens when considering a wider repertoire of ontologies for each phylogeny. Crucially, there is also no relationship between “intelligence”, when measured as the number of neurons (if any)101, and the effects of training on integration. Mammals (which, in our study, mostly consist of human GRNs) are dominated by plants (which do not even have a nervous system99) in terms of causal emergence change, and slime molds (which lack a nervous system entirely102) dominate insects. Subsequent research may identify other biological parameters that map more tightly onto the different classes of response we found in these GRNs (but we already know that popular ways of categorizing networks are not sufficient to capture the dynamics we observed).

The relationship between causal emergence and gene ontology also deserves mention. The MAPK cascade is not only the most diverse ontology here, but it also corresponds to an almost doubling of causal emergence after training. This finding is relevant considering that, in addition to regulating responses to a wide array of stimuli and being found in most eukaryotes103, this pathway pre-organizes pathway segments so that they respond faster and more strongly to subsequent stimuli; that is, they form new memories more readily104. MAPK is especially interesting given its central role in stress response and memory, which are relevant as a kind of cognitive glue14,105,106 that helps bind active subunits (such as cells and molecular networks) toward a common purpose in multicellular organisms navigating a range of problem spaces16.

Similarly, the mitotic cell cycle (the second most diverse ontology) plays a key role in the reproduction of every cell in the organism. On the other hand, the regulation of the circadian rhythm results in the greatest negative change in emergence after training, contrary to what happens with most ontologies. One possible reason is that circadian GRNs are not persuadable or, in other words, hold to their priors more strongly, but it is hard to draw conclusions in the absence of more experiments. In general, our intuitions are limited by the subset of networks analyzed and would be stronger if tested on a larger and more diverse pool of GRNs, including, for example, phylogenies not considered in this work. It is a limitation of this area of inquiry that biologically accurate, fully parameterized network models are not plentiful.

There are several essential areas for future work. ODEs are a convenient formalism for studying continuous-time GRNs107, but they may produce spurious behaviors (within certain parameter ranges) that do not map to observable phenotypes108. Also, our ODEs model GRNs operating in isolation and do not consider the biological noise and interactions coming from the intracellular matrix (future work will model the matrix as an environment for receiving and sending feedback to GRNs). Our tests were done in a highly simplified model, focusing on just one layer of biological control, specifically to show the minimal features sufficient to couple learning to increases in causal emergence. Next steps include integrating this analysis with models of bioelectric109,110,111,112,113,114 and biomechanical115,116,117 aspects of cellular function, to see how these other layers compare with the biochemical one analyzed here with respect to the relationship between memories and integration.

It is, of course, essential to test these findings in real cells; technologies such as optogenetics and mesofluidics now exist that can provide the necessary level of temporal control118. Future work will also consider how causal emergence differs across other dimensions, such as the age of the host (i.e., GRNs in utero, in middle-aged adults, and in elderly patients). An intriguing hypothesis is that cancer networks may react to training differently than tissue networks do. Such insights would then inform biomedical control for the development of more efficacious drugs with fewer side effects. We have previously suggested that taking advantage of the decision-making, problem-solving, and other competencies of living material, such as by understanding how experiences affect its causality as an integrated whole distinct from the collection of its parts, will lead to a much different therapeutic landscape6,88.

Alternative approaches to modeling causality exist, especially in the field of free energy minimization119. However, while useful for inferring interactions from data, they either rely on generative models and priors (e.g., dynamic causal models) or do not inherently address causal emergence, mainly representing statistical structure instead (e.g., Bayesian networks); this further justifies \(\Phi {ID}\) as our preferred measure of causal emergence. Still, studies on the interplay between integrated information and the free energy principle have improved our understanding of the human mind120 and of virtual agents and environments121, and future work will fully merge causal emergence in GRNs with free energy minimization.

Three other areas offer immediate opportunities for further investigation. First is evolution: significant work has looked at the interplay between intelligence and evolvability122,123,124,125,126,127,128,129, and simulations and in vitro experiments could now examine the effects of learning-induced potentiation of integrated agency on the evolutionary process. We hypothesize that such experiments could reveal bootstrapping and feedback-loop dynamics in which causal emergence improves learning ability, which in turn increases causal emergence, leading to an intelligence ratchet that could potentiate the emergence of agency in the world. Second, the study of the phylogenetic and ontogenetic origins of selfhood and integrated minds may be enriched by a better understanding of the relevant dynamics not tied to a specific neural basis. Third, an expanded survey of training modes (besides associative conditioning) and subject matter (pathways and other networks in biological and technological systems) should be examined for these dynamics.

Furthermore, these data suggest that GRN modification (i.e., almost all pharmacological interventions) may need to be part of the discussions that weigh the needs of human patients against those of the biologicals used for research. This includes both those used for basic science experiments and those that produce therapeutics: animals making antibodies and other drugs, humanized pigs, and other sources of heterologous transplantation materials. In other words, to whatever extent integration metrics may relate to the existence of an inner perspective in a system57,63,130,131,132, the work described here provides an additional tool for discussions of ethics in the practice of biology and medicine.

Conclusions

Questions about the capabilities of living matter, and the applicability of tools from the computational and cognitive sciences outside the neural domain, form the basis of an exciting emerging field that encompasses efforts to understand basal cognition, diverse intelligence, active matter, and unconventional computing15,102,133,134,135,136,137,138,139,140,141,142,143,144,145,146. We suspect that the ability to modulate integrated emergent agency through specific training experiences (which do not require physical rewiring) will have implications not only for evolutionary biology and the philosophy of mind but also for engineering efforts involving biological, engineered, and hybrid agents.

Materials and methods

Biological models and simulation

We curated a dataset of 29 peer-reviewed biological network models from the BioModels open database70, encompassing all the strata of life (from bacteria to humans); this is the same dataset adopted in ref. 87. Each model describes a GRN whose nodes are proteins, metabolites, or genes (the species of a network) and whose edges are their mutual reactions. Each network is modeled over time according to chemical rate-law ODEs, and so each is a continuous-time dynamical system. Species take values on a continuous domain (such as protein concentrations or gene expression levels). We relied on the SBMLtoODEjax Python library147, which parses Systems Biology Markup Language (SBML) specification files into JAX programs. We then simulated each model in AutoDiscJAX (https://github.com/flowersteam/autodiscjax), which is not only written with the highly efficient, compute-optimized JAX framework but also enables interventions on GRNs, such as applying stimuli. For all experiments, simulations were integrated with the fourth-order Runge-Kutta method using a step size of 0.01.
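To make the integration step concrete, here is a minimal sketch of a fourth-order Runge-Kutta step applied to a toy three-gene rate law. The weights, Hill-type regulation term, and decay constant are invented for illustration and do not correspond to any BioModels entry or to the SBMLtoODEjax/AutoDiscJAX API:

```python
import jax.numpy as jnp

def rk4_step(f, y, t, dt):
    """One fourth-order Runge-Kutta step for dy/dt = f(y, t)."""
    k1 = f(y, t)
    k2 = f(y + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = f(y + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = f(y + dt * k3, t + dt)
    return y + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def toy_grn(y, t):
    """Toy 3-species rate law: saturating (Hill-like) regulation, linear decay."""
    w = jnp.array([[0.0, 2.0, -1.0],
                   [0.0, 0.0, 2.0],
                   [1.5, 0.0, 0.0]])   # invented regulatory weights
    drive = w @ (y**2 / (1.0 + y**2))  # saturating activation/repression
    return drive - 0.5 * y             # first-order degradation

y, t, dt = jnp.ones(3), 0.0, 0.01      # dt matches the step size used here
for _ in range(1000):                  # 10 s of simulated time
    y = rk4_step(toy_grn, y, t, dt)
    t += dt
```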

Random models

We built a set of 145 random networks (5 different seeds per biological model) by the gene circuit method71, as in ref. 87. For each biological network, we created five random networks by sampling, with different random seeds, the parameters (i.e., connection strengths, initial concentrations, and constants) from a uniform distribution U(0, 1), with bounds set to be consistent with the empirical distribution of parameters in the biological models. These models have the same distribution of structural properties, in particular topology and network-theoretic properties (in-degree, out-degree, betweenness centrality, PageRank, HITS scores), as the biological models they are built from, thus representing a class of synthetic networks with biologically structured properties and random weights.
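A minimal sketch of this resampling, under the assumption (our reading of the method) that the uniform sampling is rescaled to the empirical parameter range of each model; the parameter names are invented placeholders:

```python
import numpy as np

def randomize_parameters(bio_params, seed):
    """Resample all parameters of one biological model uniformly over its
    empirical parameter range, keeping the topology fixed."""
    rng = np.random.default_rng(seed)
    lo, hi = min(bio_params.values()), max(bio_params.values())
    return {name: lo + (hi - lo) * rng.uniform() for name in bio_params}

bio_params = {"k_on": 0.8, "k_off": 0.2, "init_A": 0.5}   # illustrative
random_models = [randomize_parameters(bio_params, s) for s in range(5)]
```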

Memory evaluation

Of the types of memory identified in ref. 87, we focused here on associative memory, since it most emphasizes the need for a network to integrate experiences across different nodes, and because it is the most interesting clinically (offering the possibility of associating powerful but toxic drugs with “placebo” triggers)88. Associative memory is analogous to Pavlovian training in animals; we present an illustration in Fig. 1A, with the corresponding training schedule for our ODE GRNs in Fig. 1B. Associative memory involves a triplet of nodes (a circuit): a target R, a UCS that regulates R, and an NS that does not. We first “relaxed” the GRN, simulating it for ts time steps without any stimuli, to allow it to settle on a steady state and provide a baseline for its pre-training behavior (relaxation phase). We then trained it by stimulating the UCS and NS simultaneously for another ts time steps (training phase). If, finally, stimulation of the NS alone regulated R for another ts time steps (testing phase), we said the network had “learned” to associate the UCS with the NS, turning the NS into a conditioned stimulus (CS). After preliminary experiments, we found ts = 250,000 (2500 s of simulated time) to be sufficient for all the networks to settle on steady values.

For each network, we tested every possible triplet of nodes for associative memory. Since biological species take on continuous values in ODE networks, we can either up-stimulate (increase the value to some extent) or down-stimulate (decrease it to some extent) a specific species. Similarly, stimulation can up-regulate or down-regulate R, depending on whether it increases or decreases its value. Following ref. 87, we up-stimulated and down-stimulated by setting the quantity of a stimulated species ST to \({e}_{ST}^{\max }\times 100\) and \({e}_{ST}^{\min }/100\), respectively, where \({e}_{ST}^{\max }\) and \({e}_{ST}^{\min }\) are the maximum and minimum values the species attained. We called R up-regulated if its mean value during testing was at least twice that during relaxation, and down-regulated if it was no more than half. In line with ref. 87, we found these values for stimulation and regulation to result in associative learning in our networks and to simulate the delivery of real-world drugs.

Real-world drug delivery does not take place in one persistent bout, but in several time-delayed dispensations. For this reason, we applied stimulation in pulses: we partitioned each phase (whether training or testing) into five equally sized time intervals and alternated between applying the stimulus (at the first, third, and fifth intervals) and not applying it. When a stimulus was not applied, the species followed their intrinsic dynamics as dictated by the network. In the Results, we verify that this pulsed stimulation was not correlated with causal emergence and thus could not by itself explain the increase in emergence after training.
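A minimal sketch of this pulse schedule as a boolean mask over one phase (the helper name is illustrative):

```python
import numpy as np

def stimulus_schedule(ts, n_intervals=5):
    """Boolean mask over one phase of ts steps: stimulation is on during
    the first, third, and fifth of five equal intervals."""
    mask = np.zeros(ts, dtype=bool)
    width = ts // n_intervals
    for i in range(0, n_intervals, 2):          # intervals 0, 2, 4
        mask[i * width:(i + 1) * width] = True
    return mask

ts = 250_000                                    # 2500 s at dt = 0.01
train_mask = stimulus_schedule(ts)              # applied to both UCS and NS
test_mask = stimulus_schedule(ts)               # applied to the NS (now CS) alone
# While the mask is on, the stimulated species is set to its e_max * 100
# (up-stimulation) or e_min / 100 (down-stimulation) value, as above.
```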

Of all the triplets across the 29 biological networks considered, the 808 that passed the pretest belonged to 19 networks (more than half), in line with ref. 87. We considered these circuits for all the analyses and visualizations.

Data preprocessing

We applied the following preprocessing steps from ref. 148 to highlight underlying structures in the GRN simulation data and allow for more meaningful inferences. First, for each simulated ODE trajectory, we performed global signal regression by regressing out the mean across the species at each time step to filter out global artifacts; these could be of biological significance but, when computing information flows, we are interested in changes from the baseline rather than in global trends149. Second, we removed autocorrelation: biological signals are known to be autocorrelated, which can inflate pairwise dependencies between time series by reducing the effective degrees of freedom150. We followed the approach of ref. 151 and performed the following steps independently for each species: we computed the linear least-squares regression between times t - 1 and t, computed the predicted values at time t given the regression results, and finally took the residuals as our preprocessed signal.
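One way to implement these two steps is sketched below; the global signal regression is implemented here as a per-species regression on the cross-species mean signal, which is our reading of the procedure, and the toy data are invented:

```python
import numpy as np
from scipy.stats import linregress

def preprocess(X):
    """X: (T, n_species) expression trajectories; returns AR(1) residuals
    after global signal regression."""
    # 1) Global signal regression: regress each series on the cross-species
    #    mean signal and keep the residuals.
    g = X.mean(axis=1)
    beta = (X.T @ g) / (g @ g)            # per-species regression weights
    X = X - np.outer(g, beta)
    # 2) Autocorrelation removal: regress x_t on x_{t-1} per species and
    #    keep the residuals as the preprocessed signal.
    out = np.empty((X.shape[0] - 1, X.shape[1]))
    for j in range(X.shape[1]):
        fit = linregress(X[:-1, j], X[1:, j])
        out[:, j] = X[1:, j] - (fit.slope * X[:-1, j] + fit.intercept)
    return out

rng = np.random.default_rng(0)
clean = preprocess(rng.normal(size=(1000, 5)).cumsum(axis=0))  # toy autocorrelated data
```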

Information theory and partial information decomposition

Information theory, originally introduced to study the transmission capacity of communication channels, has over the years emerged as a principled language for evaluating dependencies in complex systems, including biological ones152. The basic object of study is Shannon’s entropy:

$$H\left(X\right)=-\sum _{x}p\left(x\right){\mathrm{ln}}p\left(x\right)$$

where the summation is over the support of X; the entropy quantifies the amount of uncertainty about a random variable X. For a process consisting of a “source” variable X and a “target” variable Y, we can then define the mutual information as the reduction in uncertainty about Y after observing X, i.e., how much information observing X discloses about Y.
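Formally, this is the drop in entropy of Y once X is known:

$$I\left(X;Y\right)=H\left(Y\right)-H\left(Y|X\right)=\sum _{x,y}p\left(x,y\right){\mathrm{ln}}\frac{p\left(x,y\right)}{p\left(x\right)p\left(y\right)}$$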

What if there is more than one source variable, as happens in complex systems like regulatory networks? For a finer-grained understanding, we must consider all the different ways information flows across a system. Intuitively, consider the case of stereoscopic vision in humans: each eye on its own captures a unique set of visual features; both eyes capture some redundant features; and depth perception, which arises only when both eyes are open simultaneously, corresponds to synergistic information. The seminal work on Partial Information Decomposition (PID) provides a framework to parcel mutual information into these information atoms (redundant, unique, and synergistic)153.
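In the two-source case, the decomposition expresses the joint mutual information as a sum of four atoms (Red, Unq, and Syn denote the redundant, unique, and synergistic atoms in the standard notation of ref. 153):

$$I\left({X}_{1},{X}_{2};Y\right)={Red}\left({X}_{1},{X}_{2};Y\right)+{Unq}\left({X}_{1};Y\right)+{Unq}\left({X}_{2};Y\right)+{Syn}\left({X}_{1},{X}_{2};Y\right)$$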

Mutual information and PID, by themselves, are instantaneous measures of integration; they fail to capture the temporal and causal aspects of information dependencies up to and including all future time steps, a crucial aspect for dynamical systems that evolve over time65, like our models of biological regulation.

Causal emergence and integrated information decomposition

We investigated how associative learning impacts the causal emergence of a system, understood as the ability of a whole to influence the future of its parts. Several operationalizations quantifying the integration of a system have blossomed, such as the various \(\Phi\) measures, with groundbreaking results in neuroscience154. Still, these are unidimensional and so tend to behave inconsistently155, as is the case for Tononi’s famed \(\Phi\). We instead pursued a multidimensional decomposition and relied on the recent framework of Integrated Information Decomposition (\(\Phi {ID}\))64; it is a finer decomposition than the PID and captures all informational dependencies of a system across space (macro- and micro-scale) and time (from instantaneous up to and including all future time steps). Under the general assumptions outlined in ref. 66, a system’s capacity to display emergence depends on how much information the whole provides about its future evolution that cannot be inferred from any subset of its parts. The \(\Phi {ID}\) formal apparatus then tells us that we can decompose emergence capacity as the sum of two terms:

1) Downward causation: the amount of information that the whole predicts about the future of the single components;

2) Causal decoupling: the amount of information that the whole predicts about the future of the whole.

This definition has previously been used to quantify the reduction in emergence capacity between healthy and brain-injured patients69; we chose it as our measure of causal emergence in biological systems because it considers all types of influence the whole can have on the future of a system. We present a schematic of how causal emergence is derived from GRN data in Fig. 8.

Fig. 8: From GRN data to causal emergence.

This schematic shows our pipeline for arriving at causal emergence values from GRN data through the \(\Phi {ID}\) decomposition. (1) We simulate a network to collect its gene expression profiles over time. (2) We preprocess the data with the techniques of ref. 126. (3) We compute the lag-1 mutual information matrix between the time series of every species in the network. (4) Using the results from the last step, we group the species into two partitions according to the minimum mutual information bipartition137. (5) Finally, we compute causal emergence as the sum of synergy (predictive power of the whole with respect to the whole) and downward causation (predictive power of the whole with respect to the individual parts) of the two partitions over time69; we remark that, as seen in this picture, causal emergence does not consider the predictive power of the individual parts with respect to the individual parts.

Gaussian information theory

Our models are continuous-valued biological dynamical systems, while information theory was originally defined for discrete random variables. We used the continuous generalization of Shannon’s entropy, the differential entropy:

$$H\left(X\right)=-\mathop{\int }\limits_{x}p\left(x\right){\mathrm{ln}}p\left(x\right){dx}$$

where the integral is over the support of X. This generalization is, in general, hard to compute because it requires estimating p(x). But if we assume that \(p\left(x\right)\) is Gaussian, we can leverage closed-form estimators for the entropy and, as a result, for all the other information measures156. Indeed, the bivariate mutual information (in natural units) becomes:

$$I\left(X,Y\right)=-\frac{{\mathrm{ln}}\left(1-{\rho }^{2}\right)}{2}$$

where \(\rho\) is the Pearson correlation coefficient between X and Y. While the Gaussian hypothesis is limiting in the general case, for our specific data we verified, through the Shapiro-Wilk test, that the preprocessed signals did not deviate significantly from a Gaussian distribution.
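Both the normality check and the closed-form mutual information are short computations; a sketch on dummy correlated Gaussian data:

```python
import numpy as np
from scipy.stats import pearsonr, shapiro

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = 0.6 * x + 0.8 * rng.normal(size=2000)   # correlated Gaussian pair (dummy)

_, p_norm = shapiro(x)                      # Shapiro-Wilk normality check
rho, _ = pearsonr(x, y)
mi_nats = -0.5 * np.log(1.0 - rho**2)       # Gaussian mutual information (nats)
```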

Most practical computations of causal emergence converge on the same simple form for Gaussian continuous variables, which we adopted here. We first computed the lag-1 mutual information matrix among all pairs of nodes in the system using the equation above157. Since the combinatorics involved prevent handling systems with many elements158, we reduced the dimensionality via the Minimum Information Bipartition (MIB)159. The MIB bisects the system into two components, approximating the bisection through the Fiedler vector (the eigenvector of the graph Laplacian corresponding to the smallest non-zero eigenvalue). After bisecting the graph with the Fiedler vector, we averaged within each component and compared the dynamics of the two parts to those of the whole. In essence, we sliced a watermelon along its longest axis and measured the extent to which the average number of seeds in one half predicted the average number of seeds in the other. Finally, we solved a linear system of equations relating the mutual information to the atoms into which the \(\Phi {ID}\) is decomposed.
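The spectral bipartition step can be sketched as follows, assuming a connected mutual information graph (the function name is illustrative and the final linear-system step is only summarized in the comment):

```python
import numpy as np

def fiedler_bipartition(mi_matrix):
    """Split a system into two parts by the sign of the Fiedler vector of the
    graph Laplacian built from the (lag-1) mutual information matrix."""
    W = (mi_matrix + mi_matrix.T) / 2        # symmetrize MI weights
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W           # graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)     # ascending eigenvalues
    fiedler = eigvecs[:, 1]                  # smallest non-zero eigenvalue
    return np.where(fiedler >= 0)[0], np.where(fiedler < 0)[0]

# Each partition is then averaged into one macro variable, and causal
# emergence is read off the PhiID atoms (downward causation + causal
# decoupling) computed from the two macro time series.
```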

Behavior descriptors

We extracted seven descriptors from each causal emergence trajectory and fed them to an unsupervised learning pipeline for automatic classification into behaviors. The descriptors, followed by a short code sketch, are:

1) Trend: the slope of the least-squares fit of the trajectory. A positive slope indicates an increasing trend, while a negative slope indicates a decreasing trend.

2) Monotonicity: the Kendall’s tau coefficient between the trajectory and the sequence of its time stamps. Kendall’s tau is a standard statistic for measuring the ordinal association between two quantities; in our case, it is highest for a perfectly monotonically increasing trajectory and lowest for a perfectly monotonically decreasing one, with values around zero corresponding to a trajectory fluctuating independently of the time axis.

3) Flatness: how flat the trajectory is, i.e., how little it deviates locally from the mean. We divided the trajectory into consecutive intervals and approximated the trajectory by its mean over each interval. We computed flatness as the r-squared coefficient of this approximation: the higher the coefficient, the better the local means fit, meaning the trajectory was locally flat (though jumps may exist at the interval boundaries). After preliminary experiments, we found an interval size of 100 to correctly capture the intuition behind the flatness of a trajectory.

4) Number of peaks: the number of local minima and maxima of the trajectory. To detect the maxima, we searched the time series for values higher than their neighbors and, to exclude weak maxima, filtered out those not equal to the maximum over a centered window of size 100 (chosen to capture the intuition behind a peak). To detect the local minima, we repeated the same procedure for values smaller than their neighbors.

5) Average distance among peaks: the average distance among all the peaks from 4) (or 0 if there were none).

6) Average difference among peaks: the average difference in causal emergence value among the peaks from 4) (or 0 if there were none).

7) Range: the difference in causal emergence value between the maximum and the minimum peaks.
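A minimal sketch computing several of these descriptors; the peak detection here approximates the centered-window criterion with a minimum inter-peak distance via scipy.signal.find_peaks, and the test trajectory is invented:

```python
import numpy as np
from scipy.signal import find_peaks
from scipy.stats import kendalltau, linregress

def behavior_descriptors(ce, window=100):
    """Compute a subset of the seven descriptors for one causal emergence
    trajectory `ce` (1D array over test-phase time steps)."""
    t = np.arange(len(ce))
    trend = linregress(t, ce).slope                  # 1) trend
    monotonicity, _ = kendalltau(t, ce)              # 2) monotonicity
    maxima, _ = find_peaks(ce, distance=window)      # 4) peaks (approximation)
    minima, _ = find_peaks(-ce, distance=window)
    n_peaks = len(maxima) + len(minima)
    peak_dist = np.diff(maxima).mean() if len(maxima) > 1 else 0.0  # 5)
    value_range = ce.max() - ce.min()                # 7) range
    return trend, monotonicity, n_peaks, peak_dist, value_range

ce = np.sin(np.linspace(0, 20, 2000)) + np.linspace(0, 1, 2000)  # toy trajectory
print(behavior_descriptors(ce))
```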

Statistics and reproducibility

General information on the statistical tests carried out is made explicit in the text whenever a test is mentioned. For all tests, we used 0.05 as the significance level and reported the sample sizes in the description of the corresponding experiment. When random data were simulated, five different random seeds were used to seed the replicates.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.