Abstract
Episodic memory involves remembering the what, when, and where components of an event. It has been observed in humans, other vertebrates, and the invertebrate cuttlefish. In clever behavioral experiments, cuttlefish have been shown to have episodic-like memory, where they demonstrate the ability to remember when and where a preferred food source will appear. The present work replicates this behavior with a parsimonious model of episodic memory. To further test this model and explore episodic-like memory, we introduce a predator-prey scenario in which the agent must remember what creatures (e.g. predator, desirable prey, or less desirable prey) appear at a given time and region of the model environment. This simulates similar situations that cuttlefish face in the wild. They will typically hide when predators are in the area, and hunt for prey when available. When the memory model is queried for an action (e.g., hunt or hide), the cuttlefish agent hunts for preferred food, like shrimp, when available, and hides at other times when a predator appears. When the memory model is queried for a place, the cuttlefish agent acts opportunistically, seeking less-preferred food (e.g., crabs) if it is located farther from a predator. These differences show how behavior can be altered depending on how memory is accessed. Querying the model over time might mimic mental time travel, a hallmark of episodic memory. Although developed with cuttlefish in mind, the model shares similarities with the hippocampal indexing theory and captures aspects of vertebrate episodic memory. This suggests that the underlying mechanisms supporting episodic-like behavior in the present model may be an example of convergent cognitive evolution.
Introduction
Episodic memory involves the recollection of a personal experience. Recalling these experiences requires mental time travel to reflect on the past1. Humans and other animals have demonstrated episodic-like memory behaviorally by recalling the “what”, “when”, and “where” of past events2. Although episodic memory was first described in humans, we treat the “what”, “when”, and “where” framework as a general behavioral criterion rather than something specific to humans. Because directly querying declarative memory in non-human animals is challenging, clever behavioral experiments have been designed to demonstrate episodic memory in a variety of organisms2. For example, it has been shown that scrub jays remember when food items are stored by allowing them to recover preferred perishable and non-perishable food3. Scrub jays searched preferentially for fresh food if not much time passed. But if enough time had passed that the preferred food had decayed, they searched for the non-perishable food. These behavioral experiments demonstrated that corvids could recall the “what” (food type), “when” (short vs. long delay), and “where” (cache location) of their memory. Although birds have a hippocampus, the avian brain lacks a cortex and the highly processed multimodal hippocampal inputs that are observed in the mammalian brain2. This comparative perspective highlights that different species can solve similar memory demands through different neural architectures.
Cuttlefish, invertebrates in the same class of cephalopods as octopus, have shown episodic-like memory in experiments similar to those carried out with scrub jays4,5. In the first phase of these experiments, cuttlefish were trained to remember the locations of a preferred food (shrimp) and a non-preferred food (crabs). In the second phase of the experiment, cuttlefish learned that the preferred food was only available after a long delay, and were able to hold off feeding on the non-preferred food until the preferred food was presented. Like the scrub jays, cuttlefish recalled that shrimp (what), were available after a delay (when), at a specific location (where). It should be noted that similar experiments in the octopus yielded mixed results. In Poncet and colleagues’ experiments6, six out of seven octopuses relied on less-cognitively demanding strategies than keeping track of time when preferred food was available. It is not clear if this was due to experimental design or different ecological demands between these two organisms. Cuttlefish have also demonstrated the ability to track time in a delayed-gratification experiment7. Therefore, the present paper will focus on episodic-like memory in cuttlefish4.
Interestingly, cuttlefish, like other cephalopods, do not have a hippocampus; rather, learning is thought to take place in their vertical lobe8,9. The anatomy of the vertical lobe has similarities to the hippocampus10,11. These similarities should be understood at the level of broad computational motifs rather than anatomical or evolutionary homology, as current cephalopod neurophysiology does not support mechanistic equivalence. There are strong fan-in signals from visual and chemotactile regions, and fan-out signals to motor areas. The neurons in this brain area show signs of long-term plasticity12. Such an architecture may support episodic-like memory, as suggested by both experimental and theoretical work. Here, we reference vertebrate memory systems only as well-studied examples of how conjunctive event information can be organized, rather than as biological templates that cephalopods must follow.
In this work, we draw on the hippocampal indexing theory13,14 as a computational analogy for organizing “what”, “when”, and “where” information, without assuming that cuttlefish implement hippocampal-like mechanisms. The idea is that the mammalian hippocampus does not contain the memory itself; rather, it has pointers to the cortex to form and retrieve memories. The multimodal information from the cortex is converted to an index that activates a set of neurons in the hippocampus. If a new memory is experienced, a new hippocampal index code is generated and the connections from the activated hippocampal neurons back to the cortical columns associated with the memory are strengthened. If a memory is to be recalled, a subset of cortical columns forms an index in the hippocampus and the complete memory is read out. In support of this theory, it has been shown that hippocampal neurons in humans encode conjunctions of a memory15. Similarly, the vertical lobe could take multiple inputs encoding “what”, “when”, and “where” to encode the appropriate memory or action. In computer science, this is like a content-addressable memory or a database that can be queried16,17.
Inspired by the hippocampal indexing theory circuitry, we created a model to simulate the episodic-like memory experiments carried out in the cuttlefish. The memory model is a three-dimensional matrix, indexed by what, when, and where tuples, which can also be queried along individual dimensions. This memory architecture is sufficient to replicate cuttlefish episodic-like memory experiments in simulation. To further challenge the model, we created a predator-prey scenario in which the simulated cuttlefish agent had to remember what item (i.e., predator, crab, shrimp) would appear when, and where it could find prey and avoid predators. How the memory was queried shaped the agent’s behavior. Whereas asking what object can be found at the current time led to delaying hunting until predators were not present, asking where the agent should go at this time led to opportunistic hunting of a safer, non-preferred food. We go on to show that multiple queries forward and backward in time might support a form of mental time travel.
The main contributions of this episodic-like memory model are:
- A parsimonious structure that can be queried across “what”, “when”, and “where” dimensions. Although the model is inspired by the hippocampal indexing theory, we suggest that the cuttlefish vertical lobe could support such an architecture.
- Replication of episodic-like memory experiments in cuttlefish, as well as flexible behavior in a more complex predator-prey simulation.
- Showing that agent behavior depends on how memory is accessed through these queries.
- A potential form of mental time travel with relatively simple extensions.
These findings show how a simple architecture might support episodic-like memory and may suggest what computations are necessary to support such a memory system. They have interesting implications for how such memory shapes behavior and decision-making. We discuss how this could be expanded in future iterations.
Materials and methods
Two episodic memory scenarios were developed. The first modeled experiments that demonstrated episodic-like memory in cuttlefish4. The second involved predator and prey, and was based on cuttlefish behavior in the wild. Like18, the agent had to distinguish between predators and prey, and then take appropriate action. The present scenario requires memory for when and where these events occur. The source code for both these simulations is written in Python and publicly available at: https://github.com/jkrichma/EpisodicLikeMemoryModel.git
Episodic memory model
At the core of both scenarios is a three-dimensional memory model that is indexed by “what”, “when”, or “where”, as well as combinations of these indices (Figure 1). The structure holds the expected values of these indices. For example, a query of “what[shrimp]” and “when[hour 3]” would return the expected values of shrimp at the 3rd hour over all locations in the environment (see Equation 1).
The result from this query is the vector \(\vec {v}\) that contains the expected value of shrimp at all the “where” locations in memory. This vector can then be used by a reinforcement learning algorithm to select the appropriate action (e.g., hunting or roaming), with an exact action set depending on the scenario described below.
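The query described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `make_memory` and `query_what_when`, the zero-based hour index, and the flattening of an 8x8 grid into 64 locations are all hypothetical choices made for the example.

```python
# Index names and sizes below are illustrative assumptions.
WHAT = {"crab": 0, "shrimp": 1}
N_HOURS = 3        # "when" dimension
N_LOCATIONS = 64   # "where" dimension, e.g. an 8x8 grid flattened row by row


def make_memory(n_what, n_when, n_where, init=0.0):
    """Build a what x when x where table of expected values."""
    return [[[init] * n_where for _ in range(n_when)] for _ in range(n_what)]


def query_what_when(memory, what, when):
    """Return a copy of the expected values of `what` at hour `when`, one per location."""
    return list(memory[what][when])


memory = make_memory(len(WHAT), N_HOURS, N_LOCATIONS)
memory[WHAT["shrimp"]][2][42] = 1.5  # made-up value: shrimp at cell 42, third hour

v = query_what_when(memory, WHAT["shrimp"], 2)
print(len(v), v[42])  # 64 1.5
```

Fixing two of the three indices returns a vector over the remaining one, which is the sense in which the structure can be queried along individual dimensions.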
We use the common delta rule to learn associations of what, when, and where. More sophisticated learning rules could be applied but the delta rule will suffice for these simulations to demonstrate episodic-like memory. The learning rule is given by Equation 2.
where \(\alpha\) is the learning rate and r is the reward or penalty incurred.
For example, if a shrimp was found and eaten at the 3rd hour at location 42 and received a reward of 4 points with learning rate of 0.10, then Equation 3 would be:
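As a hedged numerical sketch of this example, the snippet below applies the standard delta-rule form, in which the stored value moves a fraction \(\alpha\) of the way toward the reward. Two things are assumptions here rather than facts taken from the text: that the paper's update has exactly this form, and that the stored value before the first encounter is 0. The function name `delta_update` is hypothetical.

```python
def delta_update(value, reward, alpha):
    """Move the stored value a fraction alpha of the way toward the reward."""
    return value + alpha * (reward - value)


# Shrimp eaten at the 3rd hour, location 42: reward 4, learning rate 0.10.
v0 = 0.0  # assumed prior value of memory[shrimp][hour 3][location 42]
v1 = delta_update(v0, reward=4.0, alpha=0.10)
v2 = delta_update(v1, reward=4.0, alpha=0.10)  # the same event on a later day

print(round(v1, 2), round(v2, 2))  # 0.4 0.76
```

Under these assumptions the value rises from 0 to 0.4 after one rewarded encounter and to 0.76 after a second, approaching the reward of 4 with repeated experience.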
Once a value vector \(\vec {v}\) is acquired for a given action, we subject it to the Softmax function. In the case of Equation 1, the vector would be a list of expected values at each location, and the action given by Equation 4 would be to go to a specific location.
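A minimal sketch of Softmax action selection follows. The `temperature` parameter and the function names are assumptions introduced for illustration; the parameter values actually used are not given in this section.

```python
import math
import random


def softmax(values, temperature=1.0):
    """Convert expected values into selection probabilities."""
    peak = max(values)  # subtracting the peak avoids overflow in exp()
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def choose_action(values, rng, temperature=1.0):
    """Sample an index in proportion to its Softmax probability."""
    probs = softmax(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]


probs = softmax([0.5, 1.0, 4.0])  # made-up values for three options
action = choose_action([0.5, 1.0, 4.0], random.Random(0))
print([round(p, 3) for p in probs], action)
```

Higher-valued options are chosen more often, but lower-valued ones keep a nonzero probability, which leaves room for exploration.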
Combinations of memory queries can be used to obtain expected values and agent actions. For example, one could query “what” and “when” for each object (e.g., crab, shrimp, predator), then subject these values to equation 4, which may choose hunt shrimp as the best action at this time. Then an additional query of “shrimp” and “current time” would give the expected location of shrimp at the present time.
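The two-stage use of queries described above might look like the following sketch. The dictionary layout, the stored values, and the helper names `choose_object` and `best_location` are hypothetical, and taking the maximum over locations for each object is one of several possible ways to reduce a value vector to a single number.

```python
import math
import random

# Toy memory: memory[obj][hour] is a list of expected values, one per location.
# All numbers are made up for illustration.
N_LOCATIONS = 64
memory = {obj: [[0.0] * N_LOCATIONS for _ in range(3)] for obj in ("crab", "shrimp")}
memory["crab"][2][7] = 1.0     # crab valued at location 7 in the third hour
memory["shrimp"][2][63] = 4.0  # shrimp valued at location 63 in the third hour


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def choose_object(memory, hour, rng):
    """First query: a what-when value per object, then a Softmax draw over objects."""
    objects = sorted(memory)
    values = [max(memory[obj][hour]) for obj in objects]
    return rng.choices(objects, weights=softmax(values), k=1)[0]


def best_location(memory, obj, hour):
    """Second query: the location where the chosen object has its highest value."""
    values = memory[obj][hour]
    return max(range(len(values)), key=values.__getitem__)


rng = random.Random(0)
target = choose_object(memory, hour=2, rng=rng)
print(target, best_location(memory, target, hour=2))
```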
Simulation environments
We describe two different scenario environments to test the episodic memory model. In both cases, we used a grid world containing a cuttlefish agent, shrimp, crabs, and predators. Predators were assumed to detect the cuttlefish at a greater distance, while the cuttlefish’s vision allowed it to only perceive nearby objects.
Episodic-like memory scenario
In4, episodic memory in cuttlefish was investigated in a two-phase experiment. In the first phase, the cuttlefish learned where crab and shrimp were located. In the second phase, the cuttlefish learned that shrimp, which is a preferred food, was only available after a 3-hour delay. To simulate this experiment, an environment was created with two objects (a crab and a shrimp), a 3-hour duration, and an 8x8 grid world environment (see Figure 2). This created an episodic memory that was initialized to:
Episodic-Like Memory Environment. At the start of each hour, the cuttlefish is placed in the middle of the left side of the environment (x=0, y=4), and a crab is placed in the upper right region (x=7, y=0). The crab is available every hour. The shrimp’s location is in the bottom right region (x=7, y=7). During phase 1, the shrimp is available every hour. During phase 2, the shrimp is only available during hour 3. The cuttlefish can move freely within the environment, whereas the shrimp and crab remain stationary.
As in4, both the crab and shrimp are available during all 3 hours in the first phase of the experiment. If the agent reaches one of these food items, the appropriate reward (see Table 1) is applied to the memory as given by Equation 2. In the second phase, the crab is always available, but the shrimp location is only rewarded after 3 hours.
The simulation ran for 100 days, where each day lasted 3 hours, and an hour lasted 100 simulation time steps. The three actions were: hunt crab, hunt shrimp, and roam. The value for roam was set to 0.5; the other action values were learned by the model (see Equation 2). At the start of every hour: (1) Values were acquired with a query for each object (crab, shrimp) at the current time (see Equation 1). For each object, the maximum value across all locations was chosen. The average value or a sum of values could be used instead, as will be seen in the other scenario. (2) An action was chosen with the Softmax function (see Equation 4). (3) If the action was to hunt, an additional query retrieved the location where the chosen object had its maximum value. It is assumed that the cuttlefish agent knows how to reach this location. (4) If the action was to roam, the cuttlefish agent moved one step in a randomly chosen direction out of 8 (the 4 cardinal and 4 diagonal directions). (5) For all 3 actions, if a shrimp or crab was available at this time and the cuttlefish agent was less than 2 steps from the object, the value was updated by Equation 2. The parameters for this scenario are given in Table 1.
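The roaming step and the proximity check in the procedure above can be sketched as follows. Several details are assumptions made for illustration: that a step off the grid is clipped back to the edge, that "less than 2 steps" is measured as the larger of the x and y offsets (Chebyshev distance), and the helper names `roam_step` and `within_reach`.

```python
import random

# The 8 single-step moves: 4 cardinal plus 4 diagonal.
MOVES = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def roam_step(pos, size, rng):
    """Take one random step, clipped so the agent stays on the size x size grid."""
    dx, dy = rng.choice(MOVES)
    x = min(max(pos[0] + dx, 0), size - 1)
    y = min(max(pos[1] + dy, 0), size - 1)
    return (x, y)


def within_reach(pos, target, reach=2):
    """True if the target is fewer than `reach` steps away (assumed Chebyshev distance)."""
    return max(abs(pos[0] - target[0]), abs(pos[1] - target[1])) < reach


rng = random.Random(0)
pos = (0, 4)      # cuttlefish start position in the 8x8 environment
shrimp = (7, 7)   # shrimp location in the 8x8 environment
for _ in range(100):  # one simulated hour of roaming
    pos = roam_step(pos, size=8, rng=rng)
    if within_reach(pos, shrimp):
        break
print(pos, within_reach(pos, shrimp))
```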
Predator-prey scenario
To further test our episodic memory model, we created a predator-prey scenario in which the cuttlefish agent learned when and where shrimp and crab were likely to be located, but also needed to learn when and where a predator could be located. To simulate this experiment, an environment was created with three objects (a crab, a shrimp, and a predator), a 6-hour duration, and a 12x12 grid (see Figure 3). This created an episodic memory that was initialized to:
Predator-Prey Environment. The simulation environment is divided into 9 regions. Every hour, the cuttlefish is placed at the top (x=6, y=0) of region 1 and can move freely between regions. Shrimp are available from hour 2 to hour 5 in region 6. Crabs are available all day (hour 0 to hour 5) in region 8. The predator is in the environment from hour 1 to hour 3. Every hour when available, the shrimp and crab are placed anywhere in their respective regions and stay stationary. Every hour when present, the predator is placed somewhere in the left side (X < 6) of the environment. The predator either moves randomly or approaches a cuttlefish if it is within the predator’s vision.
The simulation ran for 200 days, where each day lasted 6 hours, and an hour lasted 100 simulation time steps. The four actions were: hunt crab, hunt shrimp, hide, and roam. The value for roam was set to 0.5; the other action values were learned by the model (see Equation 2). At the start of each hour, objects were placed in the appropriate regions of the environment (see Figure 3). If the action was hunt, the cuttlefish agent headed to the center of the region where the prey was most likely to be found. Once in the region, the agent randomly roamed the region until a prey was found, at which point it received the appropriate reward, or until the hour had passed. If the action was hide, the agent simulated camouflaging; that is, the cuttlefish was invisible to the predator. The predator randomly roamed the environment until the cuttlefish came within a distance of 4, at which point the predator headed directly toward the cuttlefish.
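The predator's behavior described above can be sketched as a single movement rule. This is an illustration under assumptions: distance is taken as the number of grid steps (Chebyshev distance), "within a distance of 4" is read as less than or equal to 4, random moves are clipped at the grid edge, and the function name `predator_step` is hypothetical.

```python
import random

MOVES = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def sign(n):
    return (n > 0) - (n < 0)


def predator_step(predator, cuttlefish, hidden, rng, size=12, vision=4):
    """Chase a visible cuttlefish within `vision`; otherwise take a random step."""
    dx = cuttlefish[0] - predator[0]
    dy = cuttlefish[1] - predator[1]
    # Assumption: distance is measured in grid steps (Chebyshev distance).
    visible = (not hidden) and max(abs(dx), abs(dy)) <= vision
    step = (sign(dx), sign(dy)) if visible else rng.choice(MOVES)
    x = min(max(predator[0] + step[0], 0), size - 1)
    y = min(max(predator[1] + step[1], 0), size - 1)
    return (x, y)


rng = random.Random(1)
print(predator_step((0, 0), (3, 3), hidden=False, rng=rng))  # chases: (1, 1)
print(predator_step((0, 0), (3, 3), hidden=True, rng=rng))   # random step, cuttlefish is camouflaged
```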
Two experiments were carried out: (1) The memory was queried based on each object’s expected value (“What”). (2) The memory was queried based on each region’s expected value (“Where”). The parameters for this scenario are given in Table 2.
What Query Experiment. At the start of every hour: (1) Values were acquired with a “what” query for each object (crab, shrimp) at the current time (see Equation 1). In this experiment, the total value was summed over all locations for each object. (2) An action was chosen with the Softmax function (see Equation 4). (3) If the action was to hunt, an additional query retrieved the region where the chosen object had its maximum value. The agent then proceeded to the center of the selected region. It is assumed that the cuttlefish agent knows how to reach this location. (4) If the action was to roam, the cuttlefish agent moved one step in a randomly chosen direction out of 8 (the 4 cardinal and 4 diagonal directions). (5) For all actions except hide, if a shrimp or crab was available at this time and the cuttlefish agent was less than 2 steps from the object, the value was updated by Equation 2.
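The “what” query in this experiment sums values over locations, whereas the first scenario took the maximum. The toy sketch below contrasts the two reductions on made-up numbers; the memory layout and the name `what_values` are hypothetical.

```python
# Toy memory[obj][hour][location]; a single hour and four locations, made-up values.
memory = {
    "crab": [[0.0, 0.6, 0.6, 0.6]],    # modest value spread over several cells
    "shrimp": [[0.0, 0.0, 0.0, 1.5]],  # higher value concentrated in one cell
}


def what_values(memory, hour, aggregate=sum):
    """Reduce each object's value vector at `hour` to one number per object."""
    return {obj: aggregate(table[hour]) for obj, table in memory.items()}


by_sum = what_values(memory, 0, sum)
by_max = what_values(memory, 0, max)
print(by_sum["crab"] > by_sum["shrimp"], by_max["shrimp"] > by_max["crab"])  # True True
```

In this toy case the ranking of the two prey flips with the choice of reduction, so the aggregation step is part of how the memory is accessed and not only a bookkeeping detail.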
Where Query Experiment. At the start of every hour: (1) Values were acquired with a “where” query for each region (see Equation 1). In this experiment, the total value was summed over all locations within each region. (2) An action for going to a region was chosen with the Softmax function (see Equation 4). (3) The agent headed from its initial position to the center of the chosen region. (4) If the action was to roam, the cuttlefish agent moved one step in a randomly chosen direction out of 8 (the 4 cardinal and 4 diagonal directions). (5) For all actions except hide, if a shrimp or crab was available at this time and the cuttlefish agent was less than 2 steps from the object, the value was updated by Equation 2.
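A sketch of the “where” query follows. It rests on several assumptions that are not confirmed by the text: regions are 4x4 blocks numbered 0 to 8 row by row (a numbering that is at least consistent with the start position x=6, y=0 lying in region 1), locations are flattened as `y * 12 + x`, values are summed across all objects, and predator encounters are stored as negative values. The stored numbers are made up.

```python
GRID = 12                               # 12x12 grid world
REGIONS_PER_SIDE = 3
REGION_SIZE = GRID // REGIONS_PER_SIDE  # 4x4 cells per region


def region_of(x, y):
    """Map a cell to a region index 0..8, numbered row by row (an assumption)."""
    return (y // REGION_SIZE) * REGIONS_PER_SIDE + (x // REGION_SIZE)


def region_values(memory, hour):
    """Sum expected values over all objects and all cells inside each region."""
    totals = [0.0] * (REGIONS_PER_SIDE ** 2)
    for table in memory.values():
        for loc, value in enumerate(table[hour]):
            x, y = loc % GRID, loc // GRID
            totals[region_of(x, y)] += value
    return totals


def blank():
    return [[0.0] * (GRID * GRID) for _ in range(6)]  # 6 hours x 144 cells


memory = {"crab": blank(), "shrimp": blank(), "predator": blank()}
memory["crab"][2][10 * GRID + 9] = 1.0       # crab at (x=9, y=10), region 8
memory["shrimp"][2][9 * GRID + 1] = 4.0      # shrimp at (x=1, y=9), region 6
memory["predator"][2][9 * GRID + 2] = -10.0  # assumed penalty at (x=2, y=9), region 6

totals = region_values(memory, hour=2)
print(totals[6], totals[8])  # -6.0 1.0
```

With these made-up numbers, the region holding the preferred shrimp is outweighed by the predator penalty, so the crab region has the higher total. This is the kind of trade-off a region-based query exposes.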
Results
Episodic-like memory simulations
Episodic-like memory was shown in studies in which the cuttlefish had to remember where crab and shrimp were located, and when the preferred food (i.e., shrimp) was available4,5. To show how the present episodic memory model could support such behavior, the basic idea of the study was replicated (see Figure 2). The cuttlefish agent roamed its environment until it either found a prey or 100 time steps, which corresponded to an hour, had passed. The simulation lasted 100 days and there were 3 hours in each day. Because the cuttlefish agents made random movements, the simulation was run 100 times. In the first phase of the experiment, both crab and shrimp are available all day (3 hours). After 50 days, phase 2 begins where the crab is still available all 3 hours, but the shrimp is only available during the 3rd hour.
The cuttlefish agent’s choices in both phases were consistent with those reported in4. In phase 1, the cuttlefish agent showed a clear preference for shrimp (see Figure 4A). In phase 2, the cuttlefish agent went to the crab location after 1-hour delays, and chose to forage in the shrimp location after 3-hour delays (see Figure 4B). These results show that the present memory model can support the acquisition and recall of “what”, “when”, and “where” information that is a feature of episodic-like memory.
Episodic-Like Memory. The percentage of choices after different delays is shown in the boxplots. The last 10 choices in each phase are shown for the 100 simulation runs. (A. Phase 1.) Crab and shrimp are available every hour. Phase 1 occurred during the first 50 days. (B. Phase 2.) Crab are available after 1-hour delays. Shrimp are available after 3-hour delays. Phase 2 occurred during the last 50 days. The red line in the boxplot is the median of the 100 simulation runs, the box extends from the first quartile to the third quartile, and the whiskers extend up to 1.5x the interquartile range beyond the quartiles. Circles denote outliers. Note that in phase 2, the median for crabs (3hr) is zero.
Predator-prey simulations
The Predator-Prey simulations further tested the episodic memory model by introducing a predator that appears a few hours each simulation day. The agent must not only remember when and where prey types are available, but also take into account when and where a predator might show up. As in the previous experiment, shrimp were preferred over crabs. Shrimp availability has more overlap with when and where the predator appears (see Figure 3). Note that the agent can only eat one prey per hour, or be eaten once per hour. After any of these events occur, the agent takes no further action until the next hour. Because the cuttlefish and predator agents have randomness to their movements, the simulations were run 100 times.
What-when query experiments
In these experiments, the memory is first queried to get the expected values of the predator and each prey at the current hour. These values are turned into an action vector to roam, hunt crab, hunt shrimp, or hide. Figure 5 shows how the cuttlefish agent learned over time to avoid the predator and eat the preferred prey. By the 100th day, the agent ate more shrimp than crabs and was rarely caught by the predator.
An examination of the actions taken per hour revealed that the cuttlefish agent learned to opportunistically hunt crab and shrimp during the hours that the environment was predator free. In the first 20 days, the cuttlefish agent roamed often, which increased predation risk but also facilitated exploration of the environment (see Figure 6A). By the last 20 days, the cuttlefish agent learned what objects to seek, when to seek them, and where to find them (see Figure 6B).
Actions per Hour with “What-When” queries. The roam, hunt or hide action that the agent chose per hour is shown for early and late trials. (A.) Actions chosen on the first 20 days. (B.) Actions chosen on the last 20 days. Bars denote the mean actions taken over 100 runs and the error bars denote the standard deviation.
When-where query experiments
In these experiments, the memory was queried to get the expected values of each region at the current hour. These values were turned into an action vector to move towards a region in the environment. Figure 7 shows how the cuttlefish agent learned over time. In contrast to the “What-When” query experiments, more crabs were eaten than shrimp. The reason for this is that the path taken to the crab was, for the most part, predator-free. It appears that the agent opportunistically hunted crab, which was a safer choice. Although the cuttlefish agent learned to somewhat avoid the predator, it was still occasionally eaten.
An examination of the actions taken per hour in these experiments revealed that the cuttlefish agent learned the environment in the first 20 days by exploring all regions (see Figure 8A). By the last 20 days, the cuttlefish only explored the two regions where prey are found (see Regions 6 and 8 in Figure 8B). The cuttlefish agent primarily went to region 6, where shrimp are found, during hours 4 and 5, when it was predator-free. Yet it still occasionally looked for shrimp in hours 2 and 3, when shrimp were available but the predator was present. During hours 0 through 4, the cuttlefish agent opportunistically sought crab in region 8, even during those times when the predator was present. However, when shrimp were available risk-free (hours 4 and 5), the cuttlefish agent only explored region 6.
Actions per Hour with “When-Where” queries. The region that the agent chose per hour is shown for early and late trials. (A.) Actions chosen on the first 20 days. (B.) Actions chosen on the last 20 days. Bars denote the mean actions taken over 100 runs and the error bars denote the standard deviation.
Mental time travel through memory queries
Mental time travel is the ability to reason through time, and it is a hallmark of episodic memory. The present episodic memory model has sufficient information to show this capability. Because queries can reconstruct past or anticipate future states, the model supports a rudimentary form of mental time travel. As in animal experiments on episodic-like memory, the behavior shown in the above experiments hints at mental time travel, but lacks a declarative report.
Since we have access to the cuttlefish agent’s memory matrix, we can query the memory after its experience in the environment. Figure 9 shows the results of such queries at different times in the experiment. To create these charts, we made “what”, “when”, and “where” queries to the memory after a typical simulation run. If the agent started at hour 1, one could imagine traveling forward in time until hours 4 and 5 where the shrimp could be eaten without the risk of encountering a predator. Or traveling back in time to hour 0 for a risk-free crab breakfast.
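Querying across hours in this way could be sketched as below. This is a hypothetical illustration and not a procedure described in the paper: the memory layout, the stored values, the negative predator values, and the function names `timeline` and `safe_hours` are all assumptions. The availability windows follow the predator-prey environment (shrimp in hours 2 to 5, crab in hours 0 to 5, predator in hours 1 to 3).

```python
N_HOURS, N_LOCATIONS = 6, 4

# Toy memory[obj][hour][location] with made-up values.
memory = {obj: [[0.0] * N_LOCATIONS for _ in range(N_HOURS)]
          for obj in ("crab", "shrimp", "predator")}
for hour in range(0, 6):
    memory["crab"][hour][3] = 1.0
for hour in range(2, 6):
    memory["shrimp"][hour][0] = 4.0
for hour in range(1, 4):
    memory["predator"][hour][1] = -10.0  # assumed penalty for predator encounters


def timeline(memory, obj):
    """Total expected value of `obj` at each hour of the day."""
    return [sum(values) for values in memory[obj]]


def safe_hours(memory, prey, threat="predator", tol=1e-9):
    """Hours at which the prey has value and the threat has none."""
    pairs = zip(timeline(memory, prey), timeline(memory, threat))
    return [h for h, (p, t) in enumerate(pairs) if p > tol and abs(t) <= tol]


print(safe_hours(memory, "shrimp"))  # [4, 5]
print(safe_hours(memory, "crab"))    # [0, 4, 5]
```

Scanning forward from hour 1 finds the risk-free shrimp hours, and scanning backward finds the risk-free crab at hour 0, mirroring the example in the text.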
What, When, Where queries to memory. The scatter plots show the expected values of each object, at each location in the environment. The panels from left to right show the result at each hour of the simulation day. The magenta markers denote shrimp encounters, the red markers denote crab encounters, and the black markers denote predator encounters. The size of the marker is proportional to the expected value. The charts show the result after a typical simulation run. Top. Result of the “What-When” query experiments. Bottom. Result of the “When-Where” query experiments.
Figure 9 also illustrates how the type of query can shape the memory. The top chart in Figure 9 shows the result of the “What-When” query experiment. In this case, the agent took fewer risks by hunting prey when the predator was not present. The bottom chart of Figure 9 shows that “When-Where” queries led to opportunistic but risky crab hunting in hours 1 through 3. Note that the expected value of the predator is larger and that the predator locations cover more of the environment. This is because the cuttlefish agent was not hidden (i.e., camouflaged) when hunting crab and may have encountered a predator.
Discussion
Episodic-like memory allows animals to contextualize past events in order to guide future actions. An episodic memory contains what happened, when it happened, and where it happened. This conjunction of what, when, and where has now been observed in a wide range of organisms including primates, birds, and cephalopods2. Our simulations therefore speak to the behavioral expression of episodic-like memory, rather than to the neural mechanisms that may underlie it in any particular species. It has been argued that only humans have true episodic memory, where they can carry out mental time travel and reflect on their past. However, it is an open question whether this is only because humans share a common language, which allows us to report and probe memories in ways that we cannot with other animals. There is growing consensus that many animals have “dimensions of consciousness”, which not only include such self-reflection, but also other cognitive attributes19.
In the present work, we introduced a parsimonious memory model that can replicate the episodic-like memory shown in cuttlefish behavioral experiments4,5, as well as other scenarios requiring “what”, “when”, and “where” conjunctions to recall episodic events. The model takes inspiration from the hippocampal indexing theory13,14 in that combinations of “what”, “when”, and “where” indices can be used to store and recall memories. The cuttlefish’s vertical lobe, which is important for learning and memory, has anatomical similarities to the hippocampus10,12. Since there are limited studies of episodic-like memory in the cuttlefish4 and inconclusive results in the octopus6, we hope the present work inspires future experimental and modeling studies.
In simulations, our episodic memory model: (1) Replicated episodic-like memory experiments in cuttlefish4. The simulated cuttlefish agent remembered that preferred food could be found at a certain location after longer time delays than a non-preferred food. (2) Demonstrated episodic-like memory in more complex simulations that required flexible behavior. A predator-prey scenario required the cuttlefish agent to remember not only when and where preferred and non-preferred food could be found, but also when and where a predator might be encountered. The agent successfully caught preferred food when the predator was not present, and opportunistically caught non-preferred food by avoiding the predator’s typical locations. (3) Showed that how the memory was queried could affect the agent’s behavior. If the query was for what objects could be found when, the agent demonstrated risk-averse hunting. If the query was for when and where objects could be found, the agent opportunistically hunted prey while risking encounters with the predator. In both cases, the cuttlefish agent successfully learned to avoid predators and hunt prey. (4) Demonstrated the ability to travel forward and backward through time. Queries of the memory showed the expected value landscape over time. Although not shown here, a reasoning algorithm could autonomously play out imagined scenarios to choose the appropriate time and place to act.
Model assumptions
The memory model presented here was purposefully made simple to show how such a memory structure could support episodic-like memory. Therefore, we assumed the agent had a priori abilities that were not necessarily part of the episodic memory, but required for the agent to carry out its behavior.
We assumed that once the agent remembered an object’s location, it knew how to get near there. We loosened this assumption by dividing the environment into regions and once in a region the agent had to search for its prey. If path planning is an important feature of future versions, then the present model could readily incorporate other sequence learning algorithms to plan paths to locations, such as reinforcement learning, successor representations or recurrent neural networks20,21,22,23.
Another assumption was that the agent had a sense of time passage. Although it is clear in humans and other organisms that they have a sense of time at multiple timescales, it is not agreed upon how the brain implements an internal clock. Time cells have been reported in the rodent and human hippocampus24,25. Others have suggested that the basal ganglia and its interactions with the cortex can keep track of time26. For now, we assume that there is a signal that denotes the passage of time and how that timekeeper is implemented is an open issue.
It has been proposed that neurogenesis in the dentate gyrus, which is a hippocampal subfield, could register memories in time27,28. The “when” component of an episodic memory could be encoded by a neuron’s birth date. Volumetric estimates of the cuttlefish vertical lobe size show rapid growth during development29. However, whether this neurogenesis encodes time is speculation and electrophysiological experiments will need to be carried out to verify this prediction.
Mental time travel
The simulations showed that the memory model could be queried forward or backward in time. Using this information would allow the agent to observe the memory landscape and decide to act now or delay action. Mental time travel, which involves imagining past memories and looking to the future, is a hallmark of episodic memory. While it cannot be said that the present model has this capability, it could be expanded to support this idea of mental time travel. It would require the addition of reflection and reasoning over the stored memories. In30, this was achieved by implementing a large language model with a memory store and a reflection module. Such additional modules could be integrated with the present episodic memory structure.
Hippocampal indexing theory and the cuttlefish vertical lobe
A major goal of the present work was to show episodic-like memory in a simple, yet biologically plausible data structure. The mammalian hippocampus, which appears to be necessary for episodic memory, receives processed sensory input from numerous brain regions31,32,33. Evidence suggests that the hippocampus accesses memories with conjunctions of these inputs15,34. Thus, the choice of a three-dimensional matrix with dimensions denoting the what, when and where aspects of a memory, although overly simple, may be justified. There have been recent discussions about whether the hippocampus uses an indexing code15,34,35 or concept neurons36. The present model suggests that the architecture of the vertical lobe could support an index code, but neurophysiological experiments will need to be conducted to verify this prediction.
Our model of episodic-like memory takes inspiration from the Hippocampal Memory Indexing Theory, which was originally proposed by Teyler and DiScenna to account for amnesia patients, hippocampal plasticity, and the neuroanatomical interaction between the hippocampus and neocortex in the mammalian brain13,14. Like their theory, our memory model structure has a memory formation stage and a memory retrieval stage. In the memory formation stage, the “what”, “when”, and “where” attributes of an event form a unique index into the memory structure. During retrieval, the memory structure can be queried and accessed with one or more of these attributes. The attributes can be thought of as analogous to inputs from different parts of the cortex, and the memory structure can be thought of as analogous to the hippocampus. Similar to the theory, the index carried by our memory structure points to the memory attributes. A major difference in the present work is that the memory being stored and retrieved is the expected value of the attributes. However, this memory location could readily be used to point back to those attributes and recall the completed memory.
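The formation and retrieval stages described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: the class name, the method names, the learning rate, and the delta-rule update of the expected value are all assumptions, and the dense three-dimensional matrix is represented here as a dictionary with one entry per (what, when, where) cell.

```python
from itertools import product


class EpisodicMemory:
    """Toy what-when-where store; each cell holds an expected value."""

    def __init__(self, whats, n_times, n_places, learning_rate=0.5):
        self.learning_rate = learning_rate  # assumed value, for illustration
        # One cell per (what, when, where) conjunction, initialised to zero.
        self.values = {
            key: 0.0
            for key in product(whats, range(n_times), range(n_places))
        }

    def store(self, what, when, where, outcome):
        """Formation: the three attributes form a unique index into memory."""
        key = (what, when, where)
        # Assumed delta rule: move the expected value toward the outcome.
        self.values[key] += self.learning_rate * (outcome - self.values[key])

    def query(self, what=None, when=None, where=None):
        """Retrieval: match on one or more attributes; None means 'any'."""
        cue = (what, when, where)
        return {
            key: value
            for key, value in self.values.items()
            if all(c is None or c == k for c, k in zip(cue, key))
        }


memory = EpisodicMemory(["shrimp", "crab", "predator"], n_times=4, n_places=3)
memory.store("shrimp", 2, 1, outcome=1.0)
memory.store("shrimp", 2, 1, outcome=1.0)
memory.store("predator", 2, 0, outcome=-2.0)

# Query by "when" only, then pick the most valuable remembered event.
at_time_2 = memory.query(when=2)
best_key, best_value = max(at_time_2.items(), key=lambda kv: kv[1])
```

Because each key is the full (what, when, where) tuple, a retrieved value also points back to the attributes of the event, in the spirit of the index described above.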
It has been suggested that the cephalopod vertical lobe has anatomical similarities to the mammalian hippocampus10,11. Like subfields in the hippocampus, there is multimodal fan-out to the vertical lobe amacrine neurons, and fan-in to large efferent neurons that project to brain regions dictating behavior. Furthermore, like the hippocampus, the cephalopod vertical lobe is known as a site of learning and memory12. Shomrat and colleagues demonstrated long-term potentiation (LTP) in both the octopus and the cuttlefish12. They showed that the vertical lobe in both organisms had the same fan-out and fan-in architecture. However, the site of LTP in the octopus was the glutamatergic fan-out neurons, whereas in the cuttlefish it was the cholinergic fan-in neurons. We suggest that this fan-out input from the cephalopod frontal lobes to amacrine neurons might be analogous to the neocortical index into hippocampal subfields, and that the fan-in to the large efferent neurons in the vertical lobe is similar to the output of the hippocampus to cortical and subcortical targets dictating behavioral decisions. Although the present episodic-like memory model is an abstraction of these complex anatomical structures, it does suggest that this computational framework could support cephalopod memory.
Comparison to other models
The hippocampus and the surrounding medial temporal lobe have inspired numerous memory models. These models typically focus on spatial navigation, which is a major cornerstone of hippocampal research20,22,37. Hippocampus-inspired models such as the Byrne, Becker, Burgess (BBB) model38, the clone-structured cognitive graph (CSCG)37, and the Tolman-Eichenbaum Machine (TEM)39 support memory with different relational structures. BBB suggests that the parietal and retrosplenial cortices support transformations between egocentric and allocentric representations. It uses the head direction system to rotate these representations and the hippocampus to access memories38. CSCG uses graphical models to represent memory by creating different clones of observations for different contexts to resolve spatial ambiguity37. CSCG focuses on sequential learning with a graphical model that ties together observations with hidden states. When a new memory is acquired or an established memory deviates, a clone of the structure is created. These clones may be linked together through the graph structure. TEM proposes that entorhinal cortex cells form a structural knowledge basis while hippocampal cells link this basis with sensory representations, allowing generalization across environments with similar structures. TEM focuses on the relationships that form memories, using spatial navigation, family relations, and semantic relations as examples39. These models are specific to the mammalian hippocampus and may not be applicable to other biological memory structures like the cuttlefish vertical lobe. Moreover, these models require complex computations and data structures to perform memory functions. They also do not address the conjunctions necessary for episodic memory, especially when events occurred.
A recent computational model of episodic-like memory in food-caching birds has similarities to the present model40. That model was able to replicate several landmark experiments with scrub jays that demonstrated memory of food type (what), cache location (where), and how much time had elapsed since the food had been cached (when). Similar to the present model, it had an associative memory, and actions were selected using a model-free reinforcement learning framework. Also, the scrub jay experimental paradigm3 is similar to the cuttlefish paradigm4, which was replicated in the first set of experiments. However, the memory structure in the present model differs from their associative memory in that it specifically addresses what, when, where tuples and can be queried. As discussed in the previous section, the memory model structure makes predictions about how the vertical lobe might operate similarly to the hippocampus. Because of these architectural decisions, the present model has very few open parameters. Also, the introduction of agent movement in the simulations, particularly in the predator-prey scenario, further challenged the model and led to interesting findings.
The present memory model has similarities to content-addressable memories like Hopfield networks16. In a Hopfield net, a complete memory, which is stored in a matrix-like structure, can be retrieved with partial cues. Modern versions of Hopfield networks can create and access relational memories with queries17,41. These network models are loosely based on biological memory structures. Like the other models discussed, they do not focus on conjunctions of what, when, and where. They have mostly been applied to classification problems. It might be interesting to see if they could also support episodic-like memories.
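To make retrieval from a partial cue concrete, the following is a minimal sketch of a classical Hopfield network with Hebbian weights and asynchronous updates. It is a textbook-style illustration rather than part of the present model; the two stored patterns, the use of 0 to mark unknown cue elements, and the function names are assumptions made for this example.

```python
def train(patterns):
    """Hebbian weights for +1/-1 patterns, with a zero diagonal."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, cue, max_sweeps=10):
    """Asynchronously update units until the state stops changing."""
    state = list(cue)
    for _ in range(max_sweeps):
        changed = False
        for i, row in enumerate(weights):
            field = sum(w * s for w, s in zip(row, state))
            # A zero field leaves the unit unchanged.
            new = 1 if field > 0 else -1 if field < 0 else state[i]
            if new != state[i]:
                state[i], changed = new, True
        if not changed:
            break
    return state


# Two hypothetical, mutually orthogonal memories.
memory_a = [1, 1, 1, 1, -1, -1, -1, -1]
memory_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([memory_a, memory_b])

# Partial cue: the first half of memory_a is known, the rest is unknown (0).
partial_cue = [1, 1, 1, 1, 0, 0, 0, 0]
completed = recall(weights, partial_cue)
```

With these two patterns, the half-specified cue settles onto the full stored pattern. Whether such a network could also hold what, when, where conjunctions is the open question raised above.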
Model extensions
A shortcoming of the present model is that it does not scale well and can only be accessed along its three dimensions. The scaling issue could be addressed with sparse coding and quantized query keys, as in ref. 41. The original hippocampal indexing theory suggested that a partial index into the hippocampus could yield index pointers to multiple cortical areas to bring forth a complete memory13. In computer memory circuitry, content-addressable storage systems use hash codes and other methods to scale up42. Such structures could readily be integrated into the present episodic memory model.
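One way to picture the hash-based direction is a sparse store that only allocates entries for events that were actually experienced, plus a per-attribute index so that a partial cue returns pointers to every matching memory. The sketch below is an assumption-laden illustration (hypothetical class and method names, Python's built-in hashing in place of any hardware scheme) and does not implement the sparse quantized approach of ref. 41.

```python
from collections import defaultdict


class SparseEpisodicMemory:
    """Stores only experienced events; partial cues resolve via hash lookups."""

    def __init__(self):
        self.values = {}               # (what, when, where) -> expected value
        self.index = defaultdict(set)  # (dimension, attribute) -> event keys

    def store(self, what, when, where, value):
        key = (what, when, where)
        self.values[key] = value
        # Register the event under each of its three attributes.
        for dimension, attribute in enumerate(key):
            self.index[(dimension, attribute)].add(key)

    def query(self, what=None, when=None, where=None):
        cues = [
            (dimension, attribute)
            for dimension, attribute in enumerate((what, when, where))
            if attribute is not None
        ]
        if not cues:
            return dict(self.values)
        # Intersect the pointer sets of every attribute in the cue.
        keys = set.intersection(*(self.index.get(c, set()) for c in cues))
        return {key: self.values[key] for key in keys}


sparse = SparseEpisodicMemory()
sparse.store("shrimp", 2, "east", 1.0)
sparse.store("crab", 2, "west", 0.5)
sparse.store("predator", 3, "east", -2.0)

at_time_2 = sparse.query(when=2)
crab_at_time_2 = sparse.query(what="crab", when=2)
```

Storage grows with the number of stored events rather than with the product of the three dimension sizes, which is the property the scaling discussion is after.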
The behavior and environment in the present simulations are highly constrained. This was done purposely to replicate controlled experiments and to show how the present memory structure could support behavior in a more challenging scenario. As discussed, the present model could be extended to support path planning and a more plausible representation of time, which would make its behavior more complete.
In addition, incorporating an energy or satiety variable could be a useful extension. For example, motivational control was a key element in the Brea et al. model40. Close examination of the present model and simulations suggests that motivational signals would not significantly alter the results. Behavior was dictated by the relative strengths of the expected rewards and penalties. With the current model, the agent was driven to safer options due to the strong penalty for predator encounters. The larger expected value of shrimp compared with crab led to occasional risky hunting of shrimp in the presence of predators. These value differences overshadowed the need to incorporate motivational signals like energy expenditure. This could suggest that episodic-like memory might lead to planning that alleviates the concern of motivational reactions. However, an explicit energy or fear cost signal might be a useful addition in future versions of the model. Such signals could alter behavior in scenarios with longer time frames. For example, a satiety signal would allow the model to reflect how internal states shape foraging behavior in these extended scenarios, such as the behavior observed in delayed gratification experiments7.
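A purely hypothetical sketch of how a satiety signal might enter such a decision is given below. The present model has no satiety variable, so the function, the linear scaling of food value by hunger, and the numbers used are all assumptions made for illustration.

```python
def choose_action(food_value, predator_penalty, satiety):
    """Return 'hunt' or 'hide' from expected values and an internal state.

    food_value: expected reward of the prey (positive).
    predator_penalty: expected penalty of a predator encounter (zero or negative).
    satiety: 0.0 (hungry) to 1.0 (fully sated); hypothetical extension.
    """
    hunger = 1.0 - satiety
    # Assumed rule: hunger scales the food reward, the penalty is unchanged.
    net_value = hunger * food_value + predator_penalty
    return "hunt" if net_value > 0 else "hide"


# Same remembered event, different internal states.
hungry_choice = choose_action(food_value=1.0, predator_penalty=-0.4, satiety=0.0)
sated_choice = choose_action(food_value=1.0, predator_penalty=-0.4, satiety=0.8)
```

Under this assumed rule the same remembered values lead a hungry agent to hunt and a sated agent to hide, which is the kind of state-dependent shift a satiety signal would introduce over longer time frames.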
The episodic memory model introduced here is part of an ongoing project called CuttleBot. One of the first outcomes of this project was a biomimetic robot18. The highly constrained environment issue might be addressed if this memory model were embodied in an autonomous robot with active sensing. The field of neurorobotics connects the brain, body, and environment43. The three main design principles for neurorobots are44: (1) they must react quickly and appropriately to events, (2) they must have the ability to learn and remember over their lifetimes, and (3) they must weigh options that are crucial for survival. It is believed that following these design principles makes the robot’s behavior more realistic and successful. The latter two principles are addressed in this paper. The present model supports learning and memory, and the agent must weigh the tradeoff between foraging for food and becoming food itself. However, the first principle is grounded in the idea of embodied cognition, that is, body morphology and interaction with the real world shape behavior that responds quickly to events. By incorporating the present work into a robot, all three principles could be addressed.
Conclusion
In summary, we introduce a memory model that supports the conjunctions of “what”, “when”, and “where” that are necessary for episodic-like memory, and we show how this could affect behavioral decisions. The data structure can be queried by any combination of these three dimensions, enabling recall of events in ways that parallel behavioral findings in cuttlefish. With some extensions it may also support mental time travel, which is thought to be a requirement for episodic memory. This suggests that such a structure could support the computations necessary for the episodic-like memory observed across species. In his famous levels of analysis45, David Marr proposed that information processing systems could be understood at three levels: (1) computational, (2) algorithmic, and (3) implementational. Using simulations, we designed computational tasks that required episodic-like memory to solve. We proposed an algorithm based on a structure that could support conjunctive coding of what, when, and where events. More neural recordings, such as those in refs. 10 and 12, will be necessary to demonstrate whether episodic memory is implemented with such an algorithmic structure.
Data availability
The source code for these simulations is written in Python and publicly available at: https://github.com/jkrichma/EpisodicLikeMemoryModel.git
References
Tulving, E. Episodic memory: From mind to brain. Annual Rev. Psychol. 53, 1–25 (2002).
Davies, J. R. & Clayton, N. S. Is episodic-like memory like episodic memory?. Philosophical Trans. R. Soc. B: Biol. Sci. 379, 20230397. https://doi.org/10.1098/rstb.2023.0397 (2024).
Clayton, N. S. & Dickinson, A. Episodic-like memory during cache recovery by scrub jays. Nature 395, 272–274 (1998).
Jozet-Alves, C., Bertin, M. & Clayton, N. S. Evidence of episodic-like memory in cuttlefish. Curr. Biol. 23, R1033-5 (2013).
Schnell, A. K., Clayton, N. S., Hanlon, R. T. & Jozet-Alves, C. Episodic-like memory is preserved with age in cuttlefish. Proc. R. Soc. B: Biol. Sci. 288, 20211052. https://doi.org/10.1098/rspb.2021.1052 (2021).
Poncet, L., Desnous, C., Bellanger, C. & Jozet-Alves, C. Unruly octopuses are the rule: Octopus vulgaris use multiple and individually variable strategies in an episodic-like memory task. J. Exp. Biol. 225, jeb244234 (2022).
Schnell, A. K., Boeckle, M., Rivera, M., Clayton, N. S. & Hanlon, R. T. Cuttlefish exert self-control in a delay of gratification task. Proc. R. Soc. B: Biol. Sci. 288, 20203161 (2021).
Hanlon, R. T. & Messenger, J. B. Cephalopod Behaviour 2nd edn (Cambridge University Press, 2018).
Nixon, M. & Young, J. Z. The Brains and Lives of Cephalopods (Oxford University Press, 2003).
Shomrat, T., Turchetti-Maia, A. L., Stern-Mentch, N., Basil, J. A. & Hochner, B. The vertical lobe of cephalopods: an attractive brain structure for understanding the evolution of advanced learning and memory systems. J. Comp. Physiol. A Neuroethol. Sens. Neural. Behav. Physiol. 201, 947–56 (2015).
Young, J. Z. Computation in the learning system of cephalopods. Biol. Bull. 180, 200–208 (1991).
Shomrat, T. et al. Alternative sites of synaptic plasticity in two homologous “fan-out fan-in” learning and memory networks. Curr. Biol. 21, 1773–82 (2011).
Teyler, T. J. & DiScenna, P. The hippocampal memory indexing theory. Behavioral Neurosci. 100, 147–154 (1986).
Teyler, T. J. & Rudy, J. W. The hippocampal indexing theory and episodic memory: Updating the index. Hippocampus 17, 1158–1169 (2007).
Kolibius, L. D. et al. Hippocampal neurons code individual episodic memories in humans. Nat. Hum. Behaviour 7, 1968–1979 (2023).
Hopfield, J. J. & Tank, D. W. “Neural” computation of decisions in optimization problems. Biol. Cybern. 52, 141–152 (1985).
Krotov, D. & Hopfield, J. J. Dense associative memory for pattern recognition. In Lee, D., Sugiyama, M., Luxburg, U., Guyon, I. & Garnett, R. (eds.) Advances in Neural Information Processing Systems, 29 (Curran Associates, Inc., 2016).
Pfeiffer, M. A. et al. Cuttlebot: Emulating cuttlefish behavior and intelligence in a novel robot design. In Brock, O. & Krichmar, J. (eds.) From Animals to Animats 17, 93–105 (Springer Nature Switzerland, 2025).
Birch, J., Schnell, A. K. & Clayton, N. S. Dimensions of animal consciousness. Trends Cognit. Sci. 24, 789–801 (2020).
Banino, A. et al. Vector-based navigation using grid-like representations in artificial agents. Nature 557, 429–433 (2018).
Espino, H., Bain, R. & Krichmar, J. L. A rapid adapting and continual learning spiking neural network path planning algorithm for mobile robots. IEEE Robot. Autom. Lett. 9, 9542–9549 (2024).
Foster, D. J., Morris, R. G. & Dayan, P. A model of hippocampally dependent navigation, using the temporal difference learning rule. Hippocampus 10, 1–16 (2000).
Stachenfeld, K. L., Botvinick, M. M. & Gershman, S. J. The hippocampus as a predictive map. Nat. Neurosci. 20, 1643–1653 (2017).
Eichenbaum, H. Time cells in the hippocampus: a new dimension for mapping memories. Nat. Rev. Neurosci. 15, 732–744. https://doi.org/10.1038/nrn3827 (2014).
Umbach, G. et al. Time cells in the human hippocampus and entorhinal cortex support episodic memory. Proc. Nat. Acad. Sci. 117, 28463–28474 (2020).
Matell, M. S. & Meck, W. H. Cortico-striatal circuits and interval timing: coincidence detection of oscillatory processes. Cognit. Brain Res. 21, 139–170 (2004).
Aimone, J. B. Computational modeling of adult neurogenesis. Cold Spring Harb. Perspect Biol. 8, a018960 (2016).
Aimone, J. B., Deng, W. & Gage, F. H. Adult neurogenesis: integrating theories and separating functions. Trends Cognit. Sci. 14, 325–37 (2010).
Chung, W.-S., López-Galán, A., Kurniawan, N. D. & Marshall, N. J. The brain structure and the neural network features of the diurnal cuttlefish Sepia plangon. iScience 26, 105846 (2023).
Park, J. S. et al. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, UIST ’23 (Association for Computing Machinery, New York, NY, USA, 2023). https://doi.org/10.1145/3586183.3606763.
Rolls, E. T. & Treves, A. A theory of hippocampal function: New developments. Prog. Neurobiol. 238, 102636 (2024).
Sanchez-Aguilera, A. et al. An update to hippocampome.org by integrating single-cell phenotypes with circuit function in vivo. PLoS Biol. 19, e3001213 (2021).
Treves, A. & Rolls, E. T. Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus 2, 189–99 (1992).
Kolibius, L. D., Josselyn, S. A. & Hanslmayr, S. And yet, the hippocampus codes conjunctively. Trends Cognit. Sci. 29, 689–690. https://doi.org/10.1016/j.tics.2025.06.013 (2025).
Kolibius, L. D., Josselyn, S. A. & Hanslmayr, S. On the origin of memory neurons in the human hippocampus. Trends Cognit. Sci. 29, 421–433 (2025).
Quian Quiroga, R. Conjunctive or context-invariant coding in the human hippocampus?. Trends Cognit. Sci. 29, 687–688. https://doi.org/10.1016/j.tics.2025.05.006 (2025).
George, D. et al. Clone-structured graph representations enable flexible learning and vicarious evaluation of cognitive maps. Nat. Commun. 12, 2392 (2021).
Byrne, P., Becker, S. & Burgess, N. Remembering the past and imagining the future: a neural model of spatial memory and imagery. Psychol. Rev. 114, 340 (2007).
Whittington, J. C. et al. The Tolman-Eichenbaum machine: unifying space and relational memory through generalization in the hippocampal formation. Cell 183, 1249–1263 (2020).
Brea, J., Clayton, N. S. & Gerstner, W. Computational models of episodic-like memory in food-caching birds. Nat. Commun. 14, 2979 (2023).
Alonso, N. & Krichmar, J. L. A sparse quantized Hopfield network for online-continual memory. Nat. Commun. 15, 3722 (2024).
Molom-Ochir, T., Taylor, B., Li, H. & Chen, Y. R. Advancements in content-addressable memory (CAM) circuits: State-of-the-art, applications, and future directions in the AI domain. IEEE Trans. Circuits Syst. I-Regular Papers 72, 3971–3982 (2025).
Hwu, T. & Krichmar, J. Neurorobotics: Connecting the Brain, Body and Environment (MIT Press, Cambridge, MA, 2022).
Krichmar, J. L. & Hwu, T. J. Design principles for neurorobotics. Front. Neurorobot. 16 (2022).
Marr, D. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information (W.H. Freeman, San Francisco, CA, 1982).
Acknowledgements
The authors would like to thank members of the CuttleBot team for many valuable discussions. The authors would also like to thank Professor Nicola Clayton for valuable comments on an earlier version of the manuscript.
Funding
The CuttleBot team was supported by the UC Irvine California Institute for Telecommunications and Information Technology (CALIT2) in collaboration with the UC Irvine Undergraduate Research Opportunities Program (UROP). J.K. was supported in part by National Institute of Neurological Disorders and Stroke award R01 NS135850-02.
Author information
Contributions
S.K., Q.W., and J.K. designed the experiment. Q.W., K.Z. and J.K. implemented the model. All authors analyzed the results. All authors wrote the manuscript. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kandimalla, S., Wong, Q.Y., Zheng, K. et al. Episodic-like memory in a simulation of cuttlefish behavior. Sci Rep 16, 2169 (2026). https://doi.org/10.1038/s41598-025-31950-x