Flexible gating between subspaces in a neural network model of internally guided task switching

Liu, Yue; Wang, Xiao-Jing

doi:10.1038/s41467-024-50501-y

Download PDF

Article
Open access
Published: 01 August 2024

Flexible gating between subspaces in a neural network model of internally guided task switching

Nature Communications volume 15, Article number: 6497 (2024) Cite this article

7956 Accesses
5 Citations
3 Altmetric
Metrics details

Subjects

Abstract

Behavioral flexibility relies on the brain’s ability to switch rapidly between multiple tasks, even when the task rule is not explicitly cued but must be inferred through trial and error. The underlying neural circuit mechanism remains poorly understood. We investigated recurrent neural networks (RNNs) trained to perform an analog of the classic Wisconsin Card Sorting Test. The networks consist of two modules responsible for rule representation and sensorimotor mapping, respectively, where each module is comprised of a circuit with excitatory neurons and three major types of inhibitory neurons. We found that rule representation by self-sustained persistent activity across trials, error monitoring and gated sensorimotor mapping emerged from training. Systematic dissection of trained RNNs revealed a detailed circuit mechanism that is consistent across networks trained with different hyperparameters. The networks’ dynamical trajectories for different rules resided in separate subspaces of population activity; the subspaces collapsed and performance was reduced to chance level when dendrite-targeting somatostatin-expressing interneurons were silenced, illustrating how a phenomenological description of representational subspaces is explained by a specific circuit mechanism.

Reconstructing computational system dynamics from neural data with recurrent neural networks

Article 04 October 2023

Distributing task-related neural activity across a cortical network through task-independent connections

Article Open access 18 May 2023

Multiplexed subspaces route neural activity across brain-wide networks

Article Open access 09 April 2025

Introduction

A signature of cognitive flexibility is the ability to adapt to a changing task demand. Oftentimes, the relevant task is not explicitly instructed, but needs to be inferred from previous experiences. In laboratory studies, this behavioral flexibility is termed un-cued task switching. A classic task to evaluate this ability is the Wisconsin Card Sorting Test (WCST)¹. During this task, subjects are presented with an array of cards, each with multiple features, and should respond based on the relevant feature dimension (i.e. the task rule) that changes across trials. Crucially, subjects are not instructed on when the rule changes, but must infer the currently relevant rule based on the outcome of previous trials. This is different from cued rule switching paradigms where an external cue provides information to the subjects about the relevant rule for each trial (e.g.^2,3). Intact performance on un-cued task switching depends on higher-order cortical areas such as the prefrontal cortex (PFC)^4,5,6,7,8, which has been proposed to represent the task rule and modulate the activity of other cortical areas along the sensorimotor pathway⁹.

Four essential neural computations must be implemented by the neural circuitry underlying un-cued task switching. First, it should maintain an internal representation of the task rule across multiple trials when the rule is unchanged. Second, soon after the rule switches, the animal will inevitably make errors and receive negative feedback, since the switches are un-cued. This negative feedback should induce an update to the internal representation of the task rule. Third, the neural signal about the task rule should be communicated to the brain regions responsible for sensory processing and action selection. Fourth, this rule signal should be integrated with the incoming sensory stimulus to produce the correct action. In contrast, in cued rule switching tasks, the first two computations (rule maintenance and rule updating) are not required, since an external cue about the rule is provided during every trial.

Prior work has identified neural correlates of cognitive variables presumed to underlie these computations including rule¹⁰, feedback^10,11,12 and conjunctive codes for sensory, rule, and motor information¹³. In addition, different types of inhibitory neurons are known to play different functional roles in neural computation: while parvalbumin (PV)-expressing interneurons are suggested to underlie feedforward inhibition¹⁴, interneurons that express somatostatin (SST) and vasoactive intestinal peptide (VIP) have been proposed to mediate top-down control^15,16,17,18. In particular, SST and VIP neurons form a disinhibitory motif^19,20,21 that has been hypothesized to instantiate a gating mechanism for flexible routing of information in the brain²². However, there is currently a lack of mechanistic understanding of how these neural representations and cell-type-specific mechanisms work together to accomplish un-cued task switching.

To this end, we used computational modeling to formalize and discover mechanistic hypotheses. In particular, we used tools from machine learning to train a collection of biologically-informed recurrent neural networks (RNNs) to perform an analog of the WCST used in monkeys^10,11,23. Training RNN²⁴ does not presume a particular circuit solution, enabling us to explore potential mechanisms. For this purpose, it is crucial that the model is biologically plausible. To that end, each RNN was set up to have two modules: a “PFC” module for rule maintenance and switching and a “sensorimotor” module for executing the sensorimotor transformation conditioned on the rule. To explore the potential functions of different neuronal types in this task, each module of our network consists of excitatory neurons with somatic and dendritic compartments as well as PV, SST and VIP inhibitory neurons, where the connectivity between cell types is constrained by experimental data (Methods).

After training, we performed extensive dissection of the trained models to reveal that close interplay between local and across-area processing was essential for solving the WCST. First, we found that abstract cognitive variables were distinctly represented in the PFC module. In particular, two subpopulations of excitatory neurons emerge in the PFC module - one encodes the task rule and the other shows mixed-selectivity that nonlinearly depends on rule and negative feedback. Notably, neurons with similar response profiles have been reported in neurophysiological recordings of monkeys performing the same task^10,11. Second, we identified interesting structures in the local connectivity between different neuronal assemblies within the PFC module, which enabled us to compress the high-dimensional PFC module down to a low-dimensional simplified network. Importantly, the neural mechanism for maintaining and switching rule representation is readily interpretable in the simplified network. Third, we found that the rule information in the PFC module is communicated to the sensorimotor module via structured long-range connectivity patterns along the monosynaptic excitatory pathway, the di-synaptic pathway that involves PV neurons, as well as the trisynaptic disinhibitory pathway that involves SST and VIP neurons. In addition, different dendritic branches of the same excitatory neuron in the sensorimotor module can be differentially modulated by the task rule depending on the sparsity of the local connections from the dendrite-targeting SST neurons. Fourth, single neurons in the sensorimotor module show nonlinear mixed selectivity to stimulus, rule and response, which crucially depends on the activity of the SST neurons. On the population level, the neural population activity in the sensorimotor module during different task rules occupy nearly orthogonal subspaces, which is disrupted by silencing the SST neurons. Lastly, we found structured patterns of input and output connections for the sensorimotor module, which enables appropriate rule-dependent action selection. These results are consistent across dozens of trained RNNs with different types of dendritic nonlinearities and initializations, therefore pointing to a common neural circuit mechanism underlying the WCST.

Results

Training modular recurrent neural networks with different types of inhibitory neurons

We trained a collection of modular RNNs to perform the WCST. Each RNN consists of two modules: the “PFC” module receives an input about the outcome of the previous trial, and was trained to output the current rule; the “sensorimotor” module receives the sensory input and was trained to generate the correct choice (Fig. 1b). The inputs and outputs were represented by binary vectors (Fig. 1b, Methods). Each module was endowed with excitatory neurons with somatic and two dendritic compartments, as well as three major types of genetically-defined inhibitory neurons (PV, SST and VIP). Different types of neurons have different inward and outward connectivity patterns constrained by experimental data in a binary fashion (Methods, Fig. 1b). The somata of all neurons were modeled as standard leaky units with a rectified linear activation function. The activation of the dendritic compartments, which can be viewed as a proxy for the dendritic voltage, is a nonlinear sigmoidal function of the excitatory and inhibitory inputs they receive (Methods). The specific form of the nonlinearity is inspired by experiments showing that inhibition acts subtractively or divisively on the dendritic nonlinearity function depending on its relative location to the excitation along the dendritic branch²⁵. Therefore, we trained a collection of RNNs, each with either subtractive or divisive dendritic nonlinearity, to explore the effect of dendritic nonlinearity on the network function.

**Fig. 1: Model setup and performance.**

The task we trained the network on is a WCST variant used in monkey experiments^8,10,11,23 (Fig. 1a). During each trial, a reference card with a particular color and shape is presented on the screen for 500 ms, after which three test cards appear around the reference card for another 500 ms. Each card can have one of the two colors (red or blue) and one of the two shapes (square or triangle). A choice should be made for the location that contains the test card that has the same relevant feature (color or shape) as the reference card, after which the outcome of the trial is given, followed by an inter-trial interval. The relevant feature to focus on, or the task rule, changes randomly every few trials. Critically, the rule changes were not cued, requiring the network to memorize the rule of the last trial using its own dynamics. Therefore, the network dynamics should be carried over between consecutive trials, rather than reset at the end of each trial as has been done traditionally^2,26. To this end, the network operated continuously across multiple trials during training, and the loss function was aggregated across multiple trials (Methods). We used supervised learning to adjust the strength of all the connections (input, recurrent and output) by minimizing the mean squared error between the output of both modules and the desired output (rule for the PFC module and response for the sensorimotor module). Notably, only the connections between certain cell types are non-zero and can be modified. This is achieved using a mask matrix, similar to ref. ²⁷ (Methods).

After training converged, we tested the models by running them continuously across 80 trials of WCST with rule switches at randomly chosen trials. The networks made a single error after each rule switch, and were able to quickly switch to the new rule and maintain good performance (Fig. 1c, d). Correspondingly, single neurons from both modules exhibited rule-modulated persistent activity that lasted several trials (Supplementary Fig. 1).

Our networks can reliably maintain good performance after a single correct trial in the new rule, which matches the behavior of monkeys in some previous studies (e.g. Fig. 1e). However, the performance of monkeys during this task showed substantial variability across different studies as well as different sessions with a study. The number of error trials that monkeys take to switch to the new rule ranges from one trial to tens of trials^8,10,11,23. One reason is that performance typically reaches a certain criterion (e.g. 80% correct) but not 100% before rule switching, therefore an error signal could mean an erroneous sensori-motor transformation rather than rule change. Indeed, when training of our model was stopped at 80% rather than 100% accuracy, the resulting network showed gradual switching (Fig. 1f, Methods). This point will be addressed further in the Discussion section. In the following sections, we will “open the black box” to understand the mechanism the networks used to perform the WCST.

Two rule attractor states in the PFC module maintained by interactions between modules

We first dissected the PFC module, which was trained to represent the rule. Since there are two rules in the WCST task we used, we expected that the PFC module might have two attractor states corresponding to the two rules. Therefore, we examined the attractor structure in the dynamical landscape of the PFC module by initializing the network at states chosen randomly from the trial, and evolving the network autonomously (without any input) for 500 time steps (which equals 5 seconds in real time). Then, the dynamics of the PFC module during this evolution was visualized by applying principal component analysis to the population activity. The PFC population activity settled into one of two different attractor states depending on the rule that the initial state belongs to (Supplementary Fig. 2a). Therefore, there are two attractors in the dynamical landscape of the PFC module, corresponding to the two rules.

Historically, persistent neural activity corresponding to attractor states were first discovered in the PFC^28,29,30,31. However, more recent experiments found persistent neural activity in multiple brain regions, suggesting that long-range connections between brain regions may be essential for generating persistent activity^32,33,34,35. Inspired by these findings, we wondered if the PFC module in our network could support the two rule attractor states by itself, or that the long-range connections between the PFC and the sensorimotor module are necessary to support them. To this end, we lesioned the inter-modular connections in the trained networks and repeated the simulation. Interestingly, we found that for the majority of the trained networks (42 out of 52 for the fast switching networks and 28 out of 28 for the slow switching networks), their PFC activity settled into a trivial fixed point corresponding to an inactive state (Supplementary Fig. 2b, c). This result shows that the two rule attractor states in these networks are dependent on the interactions between the PFC and the sensorimotor modules, although a significant part of the excitation still comes from local populations (Supplementary Fig. 2d).

Two emergent subpopulations of excitatory neurons in the PFC module

For the PFC module to keep track of the currently valid rule, the module should stay in the same rule attractor state after receiving positive feedback, but transition to the other rule attractor state after receiving negative feedback. We reasoned that this network function might be mediated by single neurons that are modulated by the task rule and negative feedback, respectively. Therefore, we set out to look for these single neurons.

In the PFC module of the trained networks, there are indeed neurons whose activity is modulated by the task rule in a sustained fashion (example neurons in Supplementary Fig. 1 and Fig. 2a, top). In contrast, there are also neurons that show transient activity only after negative feedback. Furthermore, this activity is also rule-dependent. In other words, their activity is conjunctively modulated by negative feedback and the task rule (example neurons in Supplementary Fig. 1, red trace and Fig. 2a, bottom). We termed these two classes of neurons “rule neurons” and “conjunctive error x rule neurons” respectively.

**Fig. 2: Emergence of two subpopulations of excitatory neurons in the PFC module after training.**

We identified all the rule neurons and conjunctive error x rule neurons in the PFC module using a single neuron selectivity measure (see Methods for details). The two classes of neurons are clearly separable on the two-dimensional plane in Fig. 2c, where the x axis is the input weight for negative feedback, and the y axis is the rule modulation, which is the difference in the mean activity between the two rules (for trials following a correct trial). As shown in Fig. 2c, rule neurons receive negligible input about negative feedback, and many of them have activity modulated by rule. On the other hand, conjunctive error x rule neurons receive a substantial amount of input about negative feedback, yet their activity is minimally modulated by rule on trials following a correct trial (Fig. 2b). This pattern was preserved when aggregating across trained networks (Fig. 2c and Supplementary Fig. 3). Interestingly, neurons with similar tuning profiles have been reported in the PFC and posterior parietal cortex of macaque monkeys performing the same WCST analog^10,11.

Across different cell types in the PFC module, on average 23.1% of excitatory neurons, 57.3% of PV neurons and 38.1% of SST neurons were classified as rule neurons in each model. Compared to excitatory neurons, a much smaller fraction of inhibitory neurons in the PFC were classified as conjunctive error x rule neurons. On average, 22.9% excitatory neurons were conjunctive error x rule neurons in each model, compared with 11.5% PV neurons and 5.2% SST neurons. Therefore, we focus only on the excitatory conjunctive error x rule neurons in the analysis below.

We also performed the same analysis on the trained networks that switch rules more slowly (e.g. Fig. 1f). In those networks there is a weaker separation between the two subpopulations of excitatory neurons in the PFC module (Supplementary Fig. 6a).

Maintaining and switching rule states via structured connectivity patterns between subpopulations of neurons within the PFC module

Given the existence of rule neurons and conjunctive error x rule neurons, what is the connectivity between them that enables the PFC module to stay in the same rule attractor state when receiving correct feedback, and switch to the other rule attractor state when receiving negative feedback?

To this end, we examined the connectivity between different subpopulations of neurons in the PFC module explicitly, by computing the mean connection strength between each pair of subpopulations. This analysis reveals that the excitatory rule neurons and PV rule neurons form a classic winner-take-all network architecture³⁶ with selective inhibitory populations^37,38, where excitatory neurons preferring the same rule are more strongly connected, and they also more strongly project to PV neurons preferring the same rule (Fig. 3a). On the other hand, PV neurons project more strongly to both excitatory neurons and other PV neurons with the opposite rule preference (Fig. 3a). This winner-take-all network motif together with the excitatory drive from the sensorimotor module (Supplementary Fig. 2) is able to sustain one of the two attractor states.

**Fig. 3: An emergent circuit wiring diagram in the PFC module enables un-cued switching between rule attractor states.**

Next, how are the rule neurons connected with the conjunctive error x rule neurons such that the sub-network formed by the rule neurons can switch from one attractor to the other in the presence of the negative feedback input? Using the same method, we found that the connectivity between the rule neurons and the conjunctive error x rule neurons exhibited an interesting structure: the excitatory rule neurons more strongly target the conjunctive error x rule neurons that prefer the opposite rule; the PV rule neurons more strongly target conjunctive error x rule neurons that prefer the same rule (Fig. 3b, top two panels). On the other hand, the conjunctive error x rule neurons more strongly target the excitatory and PV rule neurons that prefer the same rule (Fig. 3b, bottom two panels).

This connectivity structure gives rise to a simple circuit diagram of the PFC module (Fig. 3c), which leads to an intuitive explanation of the circuit mechanism underlying the switching of rule attractor state. For example, suppose the network is in the attractor state corresponding to color rule, and has just received a negative feedback and is about to switch to the attractor corresponding to the shape rule (Fig. 3e, left). As shown in Fig. 2b, c, the input current that represents the negative feedback mainly targets the conjunctive error x rule neurons. In addition, since the network is in the color rule state, the excitatory and PV neurons that prefer the color rule are more active than those that prefer the shape rule. According to Fig. 3b (top two panels), the excitatory neurons that prefer the color rule strongly excite the error x shape rule neurons, and the PV neurons that prefer the color rule strongly inhibit the error x color rule neurons. Therefore, the error x shape rule neurons receive stronger total excitation than the error x color rule neurons, and will be more active (Fig. 3e, middle). Their activation will in turn excite the excitatory neurons and PV neurons that prefer the shape rule (Fig. 3b, bottom two panels). Finally, due to the winner-take-all connectivity between the rule populations (Fig. 3a), the excitatory and PV neurons that prefer the color rule will be suppressed, and the network will transition to the attractor state for the shape rule (Fig. 3e, right).

It is worth noting that the same mechanism can also trigger a transition in the opposite direction (from shape rule to color rule) in the presence of the same negative feedback signal. This is enabled by the biased connections between the rule and conjunctive error x rule populations.

Is the simplified circuit diagram (Fig. 3c) consistent across trained networks, or different trained networks use different solutions? To examine this question, we computed a “connectivity bias” measure between each pair of populations for each trained network. This measure is greater than zero if the connectivity structure between a pair of populations is closer to the one in the simplified circuit diagram in Fig. 3c than to the opposite (see Methods for details). Across trained networks, we found that the connectivity biases were mostly greater than zero (Fig. 3d), indicating that the same circuit motif for rule maintenance and switching underlies the PFC module across different trained networks.

A similar circuit architecture exists between the excitatory neurons and the SST neurons in the PFC module (Supplementary Fig. 5), where SST neurons receive stronger excitatory input from the conjunctive error x rule neurons that prefer the same rule, and also more strongly inhibit the error x rule neurons that prefer the same rule. In addition, they form a winner-take-all connectivity with the rule excitatory neurons by receiving stronger projections from the rule neurons that prefer the same rule and projecting back more strongly to the rule neurons that prefer the opposite rule. Therefore, they contribute to rule maintenance and switching in a similar way as the PV neurons.

When the same analysis was performed on the slow-switching networks (e.g. Fig. 1f), we found that although the rule neurons in these networks also form a winner-take-all connectivity structure, the connectivity between error x rule neurons and rule neurons does not exhibit the same structure as in the fast-switching models (Supplementary Fig. 6b, c). Therefore, the slow-switching networks have a similar sub-network that encodes the rule, but a poorly organized sub-network between the error x rule and rule neurons, which may explain why switching the rule takes more trial-and-error in these networks.

Top-down propagation of the rule information through structured long-range connections

Given that the PFC module can successfully maintain and update the rule representation, how does it use the rule representation to reconfigure the sensorimotor mapping? First, we found that neurons in the sensorimotor module were tuned to rule (Supplementary Fig. 7a), since they receive top-down input from the rule neurons in the PFC module. The PFC module exerts top-down control through three pathways: the monosynaptic pathway from the excitatory neurons in the PFC module to the excitatory neurons in the sensorimotor module, the tri-synaptic pathway that goes through the VIP and SST neurons in the sensorimotor module, and the di-synaptic pathway mediated by the PV neurons in the sensorimotor module (Fig. 1b). We found that there are structured connectivity patterns along all three pathways. Along the monosynaptic pathway, excitatory rule neurons in the PFC module preferentially send long-range projections to the excitatory neurons in the sensorimotor module that prefer the same rule (Fig. 4a). Along the tri-synaptic pathway, PFC excitatory rule neurons also send long-range projections to the SST and VIP interneurons in the sensorimotor module that prefer the same rule (Fig. 4b, c). The SST neurons in turn send stronger inhibitory connections to the dendrite of the local excitatory neurons that prefer the opposite rule (Fig. 4d). Along the di-synaptic pathway, the PV neurons are also more strongly targeted by PFC excitatory rule neurons that prefer the same rule (Fig. 4e), and they inhibit more strongly the local excitatory neurons that prefer the opposite rule (Fig. 4f). These trends are mostly preserved across trained networks (Supplementary Fig. 8). Therefore, rule information is communicated to the sensorimotor module synergistically via the mono-synaptic excitatory pathway, the tri-synaptic pathway that involves the SST and VIP neurons, as well as the di-synaptic pathway that involves the PV neurons, as illustrated in Fig. 4g.

**Fig. 4: Structured top-down connections enable the propagation of the rule information.**

Structured input and output connections of the sensorimotor module enable rule-dependent action selection

Given the top-down rule information from the PFC module, how does the sensorimotor module implement the sensorimotor transformation (from the cards to the response to one of the three spatial locations)? We sought to identify the structures in the input, recurrent and output connections of the sensorimotor module that give rise to this function.

We started by observing that excitatory neurons in the sensorimotor module showed a continuum of encoding strengths for task rule, response location and card features, and many neurons show conjunctive selectivity for these variables (Fig. 5b, Supplementary Fig. 7b). Therefore, we assigned each excitatory neuron in the sensorimotor module a preferred rule R, a preferred response location L and a preferred shared feature F, which is the feature that is present in both the reference card and the test card at L. For example, neurons with R = color rule, L = 1 and F = blue have the highest activity during color rule trials when the correct response is to choose the test card at location 1, and when that test card shares the blue color with the reference card (it belongs to the population with the filled green color in Fig. 5a). Intuitively, for this group of neurons to show such selectivity, they should receive strong input from the input neurons that encode the F = blue feature of the test card at location L = 1 and the same feature for the reference card. This would enable them to detect when the test card at L = 1 and the reference card both have the feature F = blue.

**Fig. 5: Structures in the input and output weights of the sensorimotor module enable rule-dependent action selection.**

In general, for neurons that prefer rule R, response location L and shared feature F, we can define their “preferred features” as the feature F of the reference card and the same feature F for the test card at location L. Across all excitatory neurons in the sensorimotor module, we found that the connections from the input neurons that encode these preferred features were significantly stronger than the connections from the input neurons that encode other features (Fig. 5c). In addition, there is also an intuitive structure in the output connections, where excitatory neurons in the sensorimotor module that prefer a particular response location send stronger connections to the output neuron that represents the same response location (Fig. 5d). These patterns were found to be consistent across trained networks (Supplementary Fig. 9).

These structures in the input and output connections give rise to an intuitive explanation of how the sensorimotor module can perform rule-dependent action selection required for the WCST. Here we illustrate this mechanism with an example trial (Fig. 5a), where the current rule is color, the reference card is a blue circle, and the test cards at locations 1, 2 and 3 are blue triangle, red circle and red triangle, respectively. According to the rule of WCST, the correct response should be location 1, since the test card at that location matches the reference card in color. This choice can be generated as follows: the excitatory population in the sensorimotor module that prefers R = color rule, L = 1 and F = blue will be most strongly activated since they not only receive strong top-down excitatory input from the PFC module, but also the strongest excitatory input from the input neurons. Therefore, they are the most strongly activated population (Fig. 5a). Since they prefer response location 1, they will activate the output neuron that prefers response location L = 1, which is the correct response.

Recurrent connectivity and dynamics within the sensorimotor module

Given that different populations of neurons in the sensorimotor module receive differential inputs about the external sensory stimuli and rule via the structured input and top-down connections, how are they recurrently connected to produce dynamics that lead to a categorical choice? To answer this, we first visualized the population neural dynamics in the sensorimotor module by using principal component analysis (Fig. 6a, b). As shown in Fig. 6a, neural trajectories during the inter-trial interval are clustered according to the task rule. During the response period, the neural trajectories are separable according to the response locations, albeit only in higher-order principal components (Fig. 6b). In addition, the subspaces spanned by neural trajectories of different rules and response locations are more orthogonal to each other compared to randomly shuffled data (Fig. 6c, d, Methods).

**Fig. 6: Recurrent dynamics and connectivity within the sensorimotor module.**

What connectivity structure gives rise to this signature in the population dynamics? To answer this, we examined the pattern of connection weights between excitatory and PV neurons that prefer different rules (R), response locations (L), and shared features (F) by computing the connectivity biases between populations of neurons that are selective to different rules (Fig. 6e), response locations (Fig. 6f) and shared features (Fig. 6g). A greater than zero connectivity bias means populations that prefer different rules (or response locations or shared features) form a winner-take-all circuit structure analogous to the one observed between rule-selective populations in the PFC module (c.f. top panel of Fig. 3d, details about how the connectivity biases were computed is described in Methods). We observed that many of the connectivity biases were significantly above zero (Fig. 6e–g), especially for the ones that correspond to the inhibitory connections originating from the PV neurons. This indicates that populations of neurons in the sensorimotor module that are selective to different rules, response locations and shared features overall inhibit each other. This mutual inhibition circuit structure magnifies the difference in the amount of long-range inputs that different populations receive (Fig. 5) and leads to a categorical choice.

SST neurons are essential to dendritic top-down gating

The previous sections elucidate the key connectivity structures that enable the network to perform the WCST. In this final section we are going to take advantage of the biological realism of the trained RNN and examine the function of SST neurons in this model.

It has been observed that different dendritic branches of the same neuron can be tuned to different task variables^39,40,41,42. This property may enable individual dendritic branches to control the flow of information into the local network^19,22. Given these previous findings, we examined the coding of the top-down rule information at the level of individual dendritic branches. Since each excitatory neuron in our networks is modeled with two dendritic compartments, we examined the encoding of rule information by different dendritic branches of the same excitatory neuron in the sensorimotor module.

One strategy of gating is for different dendritic branches of the same neuron to prefer the same rule, in which case these neurons form distinct populations that are preferentially recruited under different task rules (population-level gating, Fig. 7a, right). An alternative strategy is for different dendritic branches of the same neuron to prefer different rules, which would enable these neurons to be involved in both task rules (dendritic branch-specific gating, Fig. 7a, left).

**Fig. 7: Examining the role of SST neurons in the sensorimotor module in top-down gating.**

In light of this, we examined for our trained networks to what extent they adopt these strategies. We found that the rule selectivity between different dendritic branches of the same neuron were highly correlated (Fig. 7b). This indicates that the trained networks are mostly using the population-level gating strategy, where different dendritic branches of the same neuron encode the same rule.

What factors might determine the extent to which the trained networks adopt these two strategies? Previous modeling work suggests that sparse connectivity from SST neurons to the dendrites of the excitatory neurons increases the degree of dendritic branch-specific gating, in the case where the connectivity is random (Fig. 4f in ref. ²²). To see if the same effect is present in our task-optimized network with structured connectivity, we re-trained networks with different levels of sparsity from 0 to 0.8 and studied its effect on the dendritic branch specificity of rule coding (Methods). We found that the degree of dendritic branch-specific encoding of the task rule increased with sparsity (see Fig. 7c, d for subtractive dendritic nonlinearity; Supplementary Fig. 11a for divisive dendritic nonlinearity). Intuitively, when the connection is sparse, a smaller number of SST neurons target each dendritic branch, making it more likely that the branch receives an uneven number of inputs from SST neurons selective for different rules. Taken together, we observed that the trained networks adopted a mixture of population-level and dendritic-level gating strategies for top-down control, and the balance between the two strategies depended on the sparsity of the connections from the SST neurons to the dendrites of excitatory neurons.

Indeed, SST neurons play a causal role in relaying the top-down rule information into the sensorimotor network and reconfiguring its dynamics according to the task rule. We simulated optogenetic inhibition by silencing the SST neurons in the sensorimotor module, which significantly impaired task performance (Fig. 7e, see Methods section for details of the protocol). In addition, the principal angle between the subspaces for different rules (Fig. 6c) significantly decreased after SST neurons in the sensorimotor module were silenced for networks with subtractive dendritic nonlinearity (Fig. 7f). This effect was not significant for networks with divisive dendritic nonlinearity (Supplementary Fig. 11c). Silencing of the SST neurons in the sensorimotor module also significantly diminished nonlinear mixed-selective coding of rule and stimulus among the excitatory neurons in the sensorimotor module (Fig. 7g, Supplementary Fig. 12, Methods), which has been proposed to be important for rule-based sensorimotor associations^43,44,45. Taken together, these results highlight the role that SST neurons in the sensorimotor module play during top-down control. This analysis also shows that by combining artificial neural network with knowledge from neurobiology, it is possible to probe the functions of fine-scale biological components in cognitive behaviors.

Discussion

In this paper, we analyzed recurrent neural networks trained to perform a classic task involving un-cued task switching—the Wisconsin Card Sorting Test. The networks consist of a “PFC” module trained to represent the rule and interacts with a “sensorimotor” module that instantiates different sensorimotor mappings depending on the rule. In order to study the functions of dendritic computation and different neuronal types, each module is endowed with excitatory neurons with two dendritic branches as well as three major types of inhibitory neurons—PV, SST and VIP. After training, we dissected the trained networks to elucidate a number of intra-areal and inter-areal neural circuit mechanisms underlying WCST, as summarized in Fig. 8.

**Fig. 8: A summary of the main results.**

Mapping between model components and brain regions

Different components of the trained network can be mapped to different brain regions (Fig. 8). While single neurons in the dorsal-lateral PFC (DLPFC) are shown to encode the task rule⁴⁶, neurons in the anterior cingulate cortex (ACC) are thought to be important for performance monitoring⁴⁷, and have been shown to receive more input about the feedback^48,49,50,51. Therefore, the rule neurons and conjunctive error x rule neurons in the model correspond to the putative functions of the neurons in DLPFC and ACC. The input to the PFC module about negative feedback may come from subcortical areas such as the amygdala⁵² or from the dopamine neurons in the substantia nigra pars compacta (SNc) and ventral tegmental area (VTA)^53,54. The sensorimotor module may correspond to parietal cortex or basal ganglia which have been shown to be involved in sensorimotor transformations^55,56. The neurons in the input layer that encode the color and shape of the card stimuli exist in higher visual areas such as the inferotemporal cortex^57,58,59. The neurons in the output layer that encode different response locations could correspond to movement location-specific neurons in the motor cortex⁶⁰.

Attractor states supported by inter-areal connections

We observed that in many networks, the interaction between the two modules was needed to sustain the two rule attractor states (Supplementary Fig. 2b, c), although the majority of the excitatory input to PFC neurons come from local population (Supplementary Fig. 2d). Traditionally, it was thought that local interactions within the frontal cortex are sufficient for the maintenance of the persistent activity^28,29,30,31. Recent large-scale electrophysiological recordings, on the other hand, revealed highly distributed encoding of cognitive variables^{32,34,61,62,63,64}. In addition, distributed patterns of persistent activity emerge in neural network models of multiple brain regions that are constrained by anatomical and neurophysiological data^65,66. Despite the empirical evidence, the functional advantages of this multi-areal encoding scheme remain an open question.

Circuit mechanism in the frontal-parietal network for rule maintenance and update

We found that the PFC module of the trained networks used a particular mechanism for maintaining and updating the rule that depends on the interaction between two distinct populations of neurons that emerge in the PFC module as a result of training: neurons that only encode the rule, and neurons that conjunctively encode negative feedback and rule (Fig. 3c). Neurons that show conjunctive selectivity for rule and negative feedback have been reported in monkey prefrontal and parietal cortices while they perform the same WCST task^10,11. Theoretical work suggests that these mixed-selective neurons are essential if the network needs to switch between different rule attractor states after receiving the same input that signals negative feedback⁶⁷. In addition, this circuit mechanism is consistent across dozens of trained networks with different initializations and dendritic nonlinearities (Fig. 3d and Supplementary Fig. 4).

This circuit mechanism bears resemblance to a previous circuit model of WCST⁶⁸. In that model, different rules are maintained by different “rule-coding clusters” that have self-excitatory connections and lateral inhibitory connections, similar to the winner-take-all attractor network between the rule neurons in our trained networks. On the other hand, these two models use different mechanisms for rule shifting: while in the Dehaene and Changeux model it is achieved via synaptic desensitization caused by the convergence of two inputs (one that signals the recent activation of the synapse, and another that signals negative feedback), our trained networks use structured connections between rule neurons and a separate population of conjunctive error x rule to shift between rules states (c.f. Fig. 3c). Notably, neurons that represent the conjunction of negative feedback and rule have been reported in the frontoparietal cortices of monkeys performing WCST^10,11.

The simplified circuit for the PFC module in Fig. 3c can be applied not only to rule switching, but to the switching between other behavioral states as well. For example, it resembles the head-direction circuit in fruit fly⁶⁹, where the offset in the connections between the neurons coding for head direction and those coding for the conjunction of angular velocity and head direction enables the circuit to update the head-direction attractor state using the angular velocity input. In addition, this circuit structure may underlie the transition from staying to switching during patch foraging behavior. Indeed, in a laboratory task mimicking natural foraging for monkeys, it was found that neurons in the anterior cingulate cortex increase their firing rates to a threshold before animals switch to another food resource⁷⁰, similar to the conjunctive error x rule neurons in our networks.

Connecting subspace to circuits

Methods that describe the representation and dynamics on the neuronal population level have gained increasing popularity and generated insights that cannot be discovered using single neuron analysis (e.g.^60,71). In the meantime, it would be valuable to connect population-level phenomena to their underlying circuit basis⁷². In our model, we found that silencing of the SST neurons has a specific effect on the population-level representation, namely, it decreased the angle between rule subspaces (Fig. 7f). Interestingly, this effect was much more prominent in networks with subtractive dendritic inhibition (Fig. 7f) than divisive dendritic inhibition (Supplementary Fig. 11c). Activating SST neurons can produce both forms of dendritic inhibition, depending on the exactly protocol used^73,74,75. Future work is required to understand the mechanistic link between different forms of dendritic inhibition and the geometry of neural representation.

We also found that silencing the other types of inhibitory neurons in the model has different effects. Silencing the PV neurons led to runaway activity in many networks (Supplementary Fig. 11e). Silencing the VIP neurons, on the other hand, caused no significant changes of the task performance (Supplementary Fig. 11f). The lack of effect after silencing the VIP neurons is due to the fact that the VIP neurons were largely inhibited by the SST neurons in the trained networks. It is conceivable that VIP neurons would play a more important role if additional constraints are introduced in training RNN to produce a robust disinhibitory motif as observed experimentally^{16,19,20,21,76,77}. Future work could study the function of VIP neurons under more realistic connectivity constraints between different cell types (e.g.⁷⁸).

Cued versus un-cued rule switching

The task our networks were trained on is an example of un-cued rule switching task, where the rule is not explicitly cued on each trial but needs to be inferred from the feedback of previous trials. This differs from cued rule switching, another commonly studied rule-based task paradigm where the task rule is explicitly cued during each trial (e.g.^2,3). Compared to cued rule switching, additional mechanisms and neural circuitries responsible for inferring the rule shift and maintaining the relevant rule are needed to perform un-cued rule switching (although monkeys could employ alternative strategies that require less cognitive load for rule maintenance, as mentioned in ref. ⁷⁹).

This difference in the required neural mechanisms leads to a difference of task performance between cued and un-cued rule switching: while it is possible to shift rules with few errors in cued rule switching tasks, such behavior is highly demanding in un-cued rule switching tasks^8,10. Below we will further discuss the limitation of our work in terms of the behavior during rule switching.

Dynamics of behavior during un-cued rule switching

The fast-switching networks in this study switch rules in just one trial (Fig. 1c, d). This fast switching agrees with the monkey behavior in some studies^11,23,79, but other studies report that monkeys switch rules using on average tens of trials^8,10. For example, in ref. ⁸, monkeys’ performance is at chance after a single error (Fig. 3D in ref. ⁸), and they gradually use positive feedback to reinforce their behavior according to the new rule (Fig. 4A in ref. ⁸). When our network model was trained to achieve less than perfect accuracy, switching after a rule change now takes a few trials (Fig. 1f) similar to behavioral observations of many monkey experiments. In this case, the rule-selective neurons in the PFC module still form a winner-take-all attractor network, but their connectivity pattern with the conjunctive error x rule neurons is not as clear cut (Supplementary Fig. 6).

Indeed, in WCST and related rule switching paradigms, subjects’ performance is often not perfect even during trials when the rule is fixed. This is possibly because during training, the rule is switched when the subjects’ performance reaches a certain criterion (e.g. 85% correct in a sequence of 20 trials in ref. ⁸). In that case, negative feedback can due to either a rule switch or the inaccuracy in the sensorimotor transformation (even under the correct rule). Therefore, subjects need to integrate information across several trials to decide whether the rule has actually switched. For example, Purcell and Kiani⁸⁰ analyzed the behavior of humans in an environment switching task. The task has a similar structure to the WCST analog used in this study, except that the noise level in the stimuli varies from trial to trial. It was shown that the behavior of subjects can be well described by a Bayesian ideal observer model, where the evidence towards an environment switch is incremented whenever the subjects make an error, and the amount of increase depends on the difficulty of the error trial: the easier the error trial is, the more likely that the environment has switched and the larger the incremental evidence towards an environment switch is.

Aside from the difficulty of the task under a fixed rule, the relationship between the different rules may also play a role in how fast animals can switch between them. In tasks that involve simple reversal of motor response or sensorimotor mappings, monkeys usually use a small number of trials to switch between rules^81,82,83. On the other hand, for WCST with more than two rules, as is usually used for humans, subjects typically use more trials to switch to the new rule^1,84.

There are other reasons that may contribute to the suboptimality of behavior during rule switching, including random exploration⁸⁴, poor sensitivity to negative feedback⁸⁴, integration of reward history across multiple trials^12,80,85,86, the gradual update of the value of the counterfactual rule⁸⁷ or the cost of cognitive control⁸⁸. Neuronal mechanisms on longer timescales such as synaptic mechanisms⁸⁹ may be required to produce the slow switching behavior.

Other limitations of the work

The SST neurons in the PFC module of our networks are targeted by the local VIP neurons. However, since the VIP neurons in our setup do not receive any excitatory input, their inhibition onto SST neurons is always released. Indeed, inhibiting the PFC VIP neurons in our the networks did not significantly affect task performance (Supplementary Fig. 5f). In reality, VIP neurons in the PFC module receive long-range connections from the mediodorsal thalamus⁹⁰ which carries information about task context⁹¹. In addition, their activity is also strongly modulated by negative reinforcement¹⁵ as well as arousal level⁹², which suggest that they receive long-range projections from subcortical regions such as ventral tegmental area, locus coeruleus⁹³ and the raphe nucleus⁹⁴. Including realistic models of these subcortical regions into the current model is an interesting direction for future work.

In conclusion, our approach of incorporating neurobiological knowledge into training RNNs can provide a fruitful way to build circuit models that are functional, high-dimensional, and reflect the heterogeneity of biological neural networks. In addition, dissecting these networks can make useful cross-level predictions that connect biological ingredients with circuit mechanisms and cognitive functions.

Methods

Model setup

The RNN consists of two bidirectionally-connected modules, the PFC module and the sensorimotor module. Each module consists of 70 excitatory neurons and 30 inhibitory neurons. Each excitatory neuron has 2 dendritic compartments. The inhibitory neurons are evenly divided into three types: PV, SST and VIP. Different types of neurons have different connectivity, inspired by experimental findings^77,95: PV neurons target the somatic compartment of excitatory neurons and other PV neurons, SST neurons target the dendritic compartment of excitatory neurons as well as PV and VIP neurons, and VIP neurons target SST neurons. Excitatory neurons target other excitatory neurons, PV and SST neurons. The connection strength between all other types of neurons were fixed at zero throughout training.

Only excitatory neurons send long-range projections to other modules. The long-range projections from the sensorimotor module to the PFC module target the dendritic compartment of the excitatory neurons and the PV neurons. This is inspired by the experimental evidence that PV neurons mediate feedforward inhibition¹⁴. The long-range top-down projections from the PFC to the sensorimotor module target the dendritic compartments of the excitatory neurons and all three types of inhibitory neurons. Finally, external inputs to both modules target the dendritic compartment of excitatory neurons and PV neurons.

The dynamics of the somata of the excitatory neurons in the RNN are described by

$$\tau \frac{d{{{{{{\bf{h}}}}}}}_{{{{{{\rm{esoma}}}}}}}}{dt}= -{{{{{{\bf{h}}}}}}}_{{{{{{\rm{esoma}}}}}}}+{f}_{{{\!{{{\rm{soma}}}}}}}({W}_{{{{{{\rm{esoma}}}}}}\to {{{{{\rm{esoma}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{esoma}}}}}}}+{W}_{{{{{{\rm{PV}}}}}}\to {{{{{\rm{esoma}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PV}}}}}}} \\ +{\sum}_{{{{{{\rm{dendrites}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{dendrite}}}}}}}),$$

(1)

where τ = 100 ms, dt = 10 ms. Somata of excitatory neurons in both the sensorimotor and PFC modules obey the same equation. Here “esoma” stands for the soma of excitatory neurons. ${W}_{{{{{{\rm{esoma}}}}}}\to {{{{{\rm{esoma}}}}}}}^{{{{{{\rm{rec}}}}}}}$ and ${W}_{{{{{{\rm{PV}}}}}}\to {{{{{\rm{esoma}}}}}}}^{{{{{{\rm{rec}}}}}}}$ represent the connection weight matrix between the somata of excitatory neurons and from the local PV neurons to the somata of excitatory neurons, respectively. h_esoma and h_PV are the activity of the somata of excitatory neurons and PV neurons. h_dendrite is the activity of the dendritic compartment. f_soma is the somatic nonlinear activation function which was modeled as a rectified linear function:

$${f}_{\!\!\!{{{{{\rm{soma}}}}}}}=\left\{\begin{array}{ll}x,\quad & x \, > \, 0\\ 0,\quad &\;{{{{{\rm{otherwise}}}}}}\end{array}\right.$$

(2)

The dendritic activity is a nonlinear function of the excitatory and inhibitory inputs.

$${{{{{\bf{{h}}}}}_{{{{{{\rm{dendrite}}}}}}}}}={f}_{{{{{{\rm{dendrite}}}}}}}({{{{{{\bf{I}}}}}}}_{{{{{{\rm{exc}}}}}}},{{{{{{\bf{I}}}}}}}_{{{{{{\rm{inh}}}}}}}).$$

(3)

I_exc is the total excitatory input to the dendrite. It consists of long-range inputs from the input neurons (neurons that encode the feedback for the PFC module and neurons that encode the stimulus for the sensorimotor module) as well as the long-range input from the excitatory neurons in the other module. I_exc = I_in + I_{cross−module}. I_inh is the inhibitory input to the dendrite from the local SST neurons. I_inh = I_SST→edend. Here “edend” stands for the dendrite of excitatory neurons. The functional form of f_dendrite is described in the next section.

The inhibitory neurons are modeled as standard point neurons. Different types of inhibitory neurons receive different input connections. In the sensorimotor (SM) module, the dynamics of PV neurons are described by

$$\tau \frac{d{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}}}{dt}= -{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}}+{f}_{{{\!\!{{{\rm{soma}}}}}}} \left({W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}}\right.\\ +{W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}}\\ +{W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{esoma}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{esoma}}}}}}}\\ +{W}_{{{{{{\rm{in}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}}{{{{{{\bf{u}}}}}}}_{{{{{{\rm{sensory}}}}}}} \\ \left.+{W}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{esoma}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{esoma}}}}}}}\right),$$

(4)

where ${W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}}^{{{{{{\rm{rec}}}}}}}$, ${W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}}^{{{{{{\rm{rec}}}}}}}$, ${W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{esoma}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{PV}}}}}}}^{{{{{{\rm{rec}}}}}}}$ are the connection weight matrices between the PV neurons, from local SST neurons to the PV neurons, and from local excitatory neurons to the PV neurons, respectively. W_in→SM,PV is the input weight matrix to the PV neurons, and u_sensory is the input to the sensorimotor module that represents the features about the cards. W_{PFC,esoma→SM,PV} is the top-down connection weight matrix from the excitatory neurons in the PFC module to the PV neurons in the sensorimotor module.

For the SST neurons,

$$\tau \frac{d{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}}}{dt}= -{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}}+{f}_{{{{{{\rm{soma}}}}}}} \left({W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{VIP}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{VIP}}}}}}}\right.\\ +{W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{esoma}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{esoma}}}}}}}\\ \left.+{W}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{esoma}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{esoma}}}}}}}\right),$$

(5)

where ${W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{VIP}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}}^{{{{{{\rm{rec}}}}}}}$ and ${W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{esoma}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}}^{{{{{{\rm{rec}}}}}}}$ are the connection weight matrices from local VIP neurons and excitatory neurons to the SST neurons, and W_{PFC,esoma→SM,SST} is the top-down connection weight matrix from the excitatory neurons in the PFC module to the SST neurons in the sensorimotor module.

For the VIP neurons,

$$\tau \frac{d{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{VIP}}}}}}}}{dt}= -{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{VIP}}}}}}}+{f}_{{{{{{\rm{soma}}}}}}}({W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{VIP}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}} \\ +{W}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{esoma}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{VIP}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{esoma}}}}}}}),$$

(6)

where ${W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{SST}}}}}}\to {{{{{\rm{SM}}}}}},{{{{{\rm{VIP}}}}}}}^{{{{{{\rm{rec}}}}}}}$ is the connection weight matrix from the local SST neurons to the VIP neurons, and W_{PFC,esoma→SM,VIP} is the top-down connection weight matrix from the excitatory neurons in the PFC module to the VIP neurons in the sensorimotor module.

The inhibitory neurons in the PFC module are described by similar equations, except only the PV neurons receive long-range bottom-up inputs from the sensorimotor module:

$$\tau \frac{d{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{PV}}}}}}}}{dt}= -{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{PV}}}}}}}+{f}_{{{{{{\rm{soma}}}}}}} \left({W}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{PV}}}}}}\to {{{{{\rm{PFC}}}}}},{{{{{\rm{PV}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{PV}}}}}}}\right.\\ +{W}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{SST}}}}}}\to {{{{{\rm{PFC}}}}}},{{{{{\rm{PV}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{SST}}}}}}}\\ +{W}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{esoma}}}}}}\to {{{{{\rm{PFC}}}}}},{{{{{\rm{PV}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{esoma}}}}}}}\\ +{W}_{{{{{{\rm{in}}}}}}\to {{{{{\rm{PFC}}}}}},{{{{{\rm{PV}}}}}}}{{{{{{\bf{u}}}}}}}_{{{{{{\rm{feedback}}}}}}}\\ \left.+{W}_{{{{{{\rm{SM}}}}}},{{{{{\rm{esoma}}}}}}\to {{{{{\rm{PFC}}}}}},{{{{{\rm{PV}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{esoma}}}}}}}\right),$$

(7)

$$\tau \frac{d{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{SST}}}}}}}}{dt}= -{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{SST}}}}}}}+{f}_{{{{{{\rm{soma}}}}}}} \left({W}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{VIP}}}}}}\to {{{{{\rm{PFC}}}}}},{{{{{\rm{SST}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{VIP}}}}}}}\right.\\ \left.+{W}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{esoma}}}}}}\to {{{{{\rm{PFC}}}}}},{{{{{\rm{SST}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{esoma}}}}}}}\right),$$

(8)

$$\tau \frac{d{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{VIP}}}}}}}}{dt}=-{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{VIP}}}}}}}+{f}_{{{{{{\rm{soma}}}}}}}({W}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{SST}}}}}}\to {{{{{\rm{PFC}}}}}},{{{{{\rm{VIP}}}}}}}^{{{{{{\rm{rec}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{SST}}}}}}}),$$

(9)

where u_feedback represents the external input to the PFC module about the feedback of the previous trial.

In practice, we used a mask matrix to enforce the connectivity between different cell types.

$${W}^{{{{{{\rm{rec}}}}}}}=| {\tilde{W}}^{{{{{{\rm{rec}}}}}}}|*M+{W}^{{{{{{\rm{fix}}}}}}},$$

(10)

where ${\tilde{W}}_{{{{{{\rm{rec}}}}}}}$ is the unconstrained connection weight matrix updated by the learning algorithm, M is a matrix consisting of 1, 0 and −1 depending on whether the corresponding connection is excitatory, inhibitory or nonexistent. W^fix implements the fixed coupling between the dendrite and the soma. * represents element-wise product.

Only the somata of excitatory neurons send output connections. The output of each module is generated via a simple linear readout:

$${{{{{{\bf{y}}}}}}}_{{{{{{\rm{SM}}}}}}}={W}_{{{{{{\rm{out}}}}}},{{{{{\rm{SM}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{SM}}}}}},{{{{{\rm{esoma}}}}}}}.$$

(11)

$${{{{{{\bf{y}}}}}}}_{{{{{{\rm{PFC}}}}}}}={W}_{{{{{{\rm{out}}}}}},{{{{{\rm{PFC}}}}}}}{{{{{{\bf{h}}}}}}}_{{{{{{\rm{PFC}}}}}},{{{{{\rm{esoma}}}}}}}.$$

(12)

Both the input and output connection weights were constrained to be positive.

Variations in the model hyperparameters

Dendritic nonlinearities

We trained models with two types of dendritic nonlinearities f_dendrite—subtractive and divisive. They are inspired by in-vitro and computational studies showing different types of inhibitory modulation on the dendritic activity depending on the location of inhibition relative to excitation²⁵. Both types of dendritic nonlinearities are sigmoidal functions of the excitatory input. Under subtractive nonlinearity, as the inhibitory input increases, the turning point of the sigmoid function moves to larger values, consistent with the experimental observation when the inhibitory current is injected at the same location or more distal than the excitation²⁵. For the divisive nonlinearity, the turning point of the sigmoid is not affected by the level of inhibition, but the saturating level of the sigmoid function decreases with the level of inhibition, consistent with the experimental observation when the inhibitory current is injected close to the soma²⁵.

The equations of the different dendritic nonlinearities are given by:

$${f}_{{{{{{\rm{dendrite}}}}}}}^{{{{{{\rm{subtractive}}}}}}}({I}_{{{{{{\rm{exc}}}}}}},{I}_{{{{{{\rm{inh}}}}}}})=\tanh \left({I}_{{{{{{\rm{exc}}}}}}}-{I}_{{{{{{\rm{inh}}}}}}}\right)$$

$${f}_{{{{{{\rm{dendrite}}}}}}}^{{{{{{\rm{divisive}}}}}}}({I}_{{{{{{\rm{exc}}}}}}},{I}_{{{{{{\rm{inh}}}}}}})={k}_{1}(1+\tanh ({I}_{{{{{{\rm{exc}}}}}}}-1))+{k}_{2},$$

where ${k}_{1}=\frac{1}{{e}^{{I}_{{{{{{\rm{inh}}}}}}}}}$ and ${k}_{2}=-1-\tanh (-1)$. The form of the divisive dendritic nonlinearity was specified such that it is divisively modulated by I_inh (even when I_exc=0), and that it is 0 only when both I_exc and I_inh are 0.

Initializations

The connectivity matrices were initialized either using a normal distribution with mean 0 and standard deviation $\sqrt{\frac{2}{N}}$ (where N is the total number of recurrent units) or a uniform distribution between $-\sqrt{\frac{6}{N}}$ and $\sqrt{\frac{6}{N}}$.

Sparsity of the SST → dendrite connectivity in the sensorimotor module

To study how the degree of dendritic branch-specific rule encoding in the sensorimotor module is affected by the sparsity of the connections from SST neurons to the dendrite of excitatory neurons, we varied this sparsity by fixing a fraction of randomly chosen connections to be 0 throughout training. The sparsity levels used were 0, 0.2, 0.4, 0.6 and 0.8.

Random seeds

For each combination of the hyperparameter configuration introduced above (except the sparsity), we trained models using 50 random seeds for Pytorch (other random seeds were fixed). For each sparsity level other than 0, we trained models using 10 random seeds for Pytorch.

Task

The networks were trained on an analog of the Wisconsin Card Sorting Test (WCST) used for monkeys^10,11,23. Each trial starts with the presentation of a “reference card” for 500 ms, after which three “test cards” appear around the reference card for 500 ms. Each card contains an object with a specific color (blue or red) and shape (circle or triangle). Among the three test cards, one of them matches the color of the reference card, another one matches the shape of the reference card, and the third card matches neither feature of the reference card. Depending on the rule (color or shape), the location where the test card has the same color or shape feature as the reference card should be chosen. The choice should be made during the 500 ms when both the reference card and the test cards are presented. At the end of this period, a feedback signal is presented for 100 ms, indicating whether the choice is correct or incorrect. This is followed by a 1 second inter-trial interval.

The task rule switches after a random number of trials, without informing the network. Therefore, the network inevitably makes an error for the first trial after the rule switch since it has not yet received the information that the rule has switched. The network should then adjust its behavior to the new rule by utilizing the feedback signal.

Representation of inputs and outputs

Each card is represented as a four-dimensional binary vector, where different entries represent the presence of the two colors and shapes. The feedback input is a two-dimensional one-hot vector, where the two entries represent positive and negative feedback. The target output for the sensorimotor module is a three-dimensional one-hot vector, where each entry represents one response location on the screen. This target is non-zero only during the 500 ms response period when both the reference card and the test cards are presented. The target output for the PFC module is a two-dimensional one-hot vector, where each entry represents one rule. This target is non-zero during the entire trial.

Training method

During training, the networks ran continuously across 20 consecutive trials with 3 random rule switches. Importantly, the network dynamics were not reset during the inter-trial interval. The loss function was aggregated across the 20 trials.

$$L= {\sum}_{{{{\rm{trial}}}}=1}^{20} {\sum}_{t} \left( || {{{\bf{y}}}}_{{{{\rm{PFC}}}}} ({{{\rm{trial}}}},t)- {\hat{{{\bf{y}}}}}_{{{\rm{rule}}}} ({{{\rm{trial}}}},t) {|| }^{2} \right. \\ \left.+|| {{{\bf{y}}}}_{{{\rm{SM}}}}({{{\rm{trial}}}},t )-{\hat{{{\bf{y}}}}}_{{{\rm{choice}}}} ({{{\rm{trial}}}},t){|| }^{2}\right),$$

(13)

where y_PFC(trial, t) and y_SM(trial, t) are the activity of the readout neurons for the PFC and sensorimotor module at time t in a given trial, respectively. ${\hat{{{{{{\bf{y}}}}}}}}_{{{{{{\rm{rule}}}}}}}({{{{{\rm{trial}}}}}},t)$ is the target output for the PFC module which represents the rule of the current trial. It is a binary vector of dimension 2 where each entry represents one rule. The activation of the entry that represents the correct rule is 1 throughout the entire trial. ${\hat{{{{{{\bf{y}}}}}}}}_{{{{{{\rm{choice}}}}}}}({{{{{\rm{trial}}}}}},t)$ is the target output for the sensorimotor module which represents the correct choice for the current trial. It is a binary vector of dimension 3 where each entry represents one of the three response locations. The activation of the entry that represents the correct choice is 1 during the response period (500 ms when both the reference card and the tests card are shown).

The standard backpropogation through time algorithm⁹⁶ with the Adam optimizer⁹⁷ was used to update all connection weights.

We also used curriculum learning to speed up training. Initially, the stimulus, choice and outcome of the previous trial were all provided to the PFC module as input. This way all the information needed to perform the current trial was contained within the input, and the networks did not need to memorize past trials. During the training phase, the network performed 20 consecutive trials with 3 random rule switches, therefore the maximum performance was 85%. When the training performance reached above 65%, we started testing the network on longer trial sequences (200 consecutive trials with 10 rule switches). The maximum performance during testing was 95%. If the networks reached on average 90% performance during the recent 5 tests, the input about the previous stimulus was removed. When the networks reached 90% performance again, the information about the previous choice information was removed. The networks were then trained until they reached 90% performance.

Lower performance criteria was used for the model trained using early stopping (Fig. 1f). In particular, curriculum training advanced to the next stage when the testing performance reached 80%.

Single neuron selectivity metric

The selectivity index (SI) for rule is defined as

$${{{\mbox{SI}}}}_{{{{{{\rm{rule}}}}}}}=\frac{h({{\mbox{color}}})-h({{\mbox{shape}}})}{| h({{\mbox{color}}})|+| h({{\mbox{shape}}})| },$$

(14)

where h(color) and h(shape) represent the trial-averaged single neuron activity during color rule and shape rule, respectively. Neural activity was first averaged over the inter-trial interval before further being averaged across trials.

The error selectivity is defined similarly

$${{{\mbox{SI}}}}_{{{{{{\rm{error}}}}}}}=\frac{h({{\mbox{after error}}})-h({{\mbox{after correct}}})}{| h({{\mbox{after error}}})|+| h({{\mbox{after correct}}})| },$$

(15)

where h(after error) and h(after correct) are the mean single neuron activity after error and correct trials, respectively. Neural activity was first averaged across the feedback presentation and inter-trial interval periods before being averaged across trials.

The selectivity for response location is defined as

$${{{\mbox{SI}}}}_{{{{{{\rm{response}}}}}}}(L)=\frac{h(L)-h(\bar{L})}{| h(L)|+| h(\bar{L})| },$$

(16)

$${{{\mbox{SI}}}}_{{{{{{\rm{response}}}}}}}=max({{{\mbox{SI}}}}_{{{{{{\rm{response}}}}}}}({L}_{1}),{{{\mbox{SI}}}}_{{{{{{\rm{response}}}}}}}({L}_{2}),{{{\mbox{SI}}}}_{{{{{{\rm{response}}}}}}}({L}_{3})),$$

(17)

where h(L) represents the mean single neuron activity during trials where the network chooses response location L. $h(\bar{L})$ represents the mean activity across trials when the choice of the network is not location L. Therefore for each neuron we can compute three selectivity indices, one for each response location. The maximum of the three indices was taken as the response selectivity of that neuron and was used for Fig. 5b. This selectivity index ranges from 0 to 1. We included neural activity during the response period when computing this selectivity index.

Neurons that prefer color/shape rule were further divided according to their preferred shared feature. The selectivity for the shared feature is defined as

$${{{\mbox{SI}}}}_{{{{{{\rm{shared}}}}}}\,{{{{{\rm{feature}}}}}}}=\frac{h({{\mbox{blue}}})-h({{\mbox{red}}})}{| h({{\mbox{blue}}})|+| h({{\mbox{red}}})| },$$

(18)

for neurons that prefer the color rule, and

$${{{\mbox{SI}}}}_{{{{{{\rm{shared}}}}}}\,{{{{{\rm{feature}}}}}}}=\frac{h({{\mbox{circle}}})-h({{\mbox{triangle}}})}{| h({{\mbox{circle}}})|+| h({{\mbox{triangle}}})| },$$

(19)

for neurons that prefer the shape rule. Here h(blue), h(red), h(circle), h(triangle) represent the mean activity of a neuron across trials when the reference card is blue, red (when the current rule is color), circle or triangle (when the current rule is shape). We included neural activity during the response period when computing this selectivity index.

Classification criteria for different neuronal populations

Each neuron in the PFC module was classified as a “rule neuron” if the absolute value of its rule selectivity was greater than 0.5 and the absolute value of its error selectivity was smaller than 0.5. The rest of the neurons were classified as “error neurons” if their error selectivity was greater than 0.5. Error neurons with greater mean activity during the color rule trials that follow an error trial were defined as error x color rule neurons, and the other error neurons were defined as error x shape rule neurons.

Each neuron in the sensorimotor module was assigned with a preferred rule, response location and shared feature according to the condition during which it has the highest activity. There was no threshold for this classification. Neurons with zero activity during all trials were excluded from the analyses.

Connectivity bias

The connectivity bias (CB) was defined as the difference in the average connection weight between different sub-population of neurons. A positive value indicates an agreement with the simplified circuit diagram (Figs. 3c and 6h). For example, the connectivity bias from the PFC PV neurons to the PFC excitatory neurons is given by

$${{{{{\rm{CB}}}}}}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\longrightarrow \,{{\mbox{PFC E}}})= \bar{W}(\,{{\mbox{PFC PV rule1}}}\longrightarrow {{\mbox{PFC E rule2}}})\\ +\bar{W}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2\longrightarrow \,{{\mbox{PFC E rule1}}})\\ - \bar{W}(\,{{\mbox{PFC PV rule1}}}\longrightarrow {{\mbox{PFC E rule1}}})\\ - \bar{W}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2\longrightarrow \,{{\mbox{PFC E rule2}}}),$$

(20)

where for example $\bar{W}(\,{{\mbox{PFC PV rule1}}}\longrightarrow {{\mbox{PFC E rule2}}})$ represents the average (unsigned) connection strength from the PFC PV neurons that prefer rule 1 to PFC excitatory neurons that prefer rule 2. Here rule 1 refers to color rule and rule 2 refers to shape rule.

The other connectivity biases were defined analogously.

$${{{{{\rm{CB}}}}}}(\,{{\mbox{PFC E}}}\longrightarrow {{\mbox{PFC E}}})= \bar{W}(\,{{\mbox{PFC E rule1}}}\longrightarrow {{\mbox{PFC E rule1}}})\\ +\bar{W}(\,{{\mbox{PFC E rule2}}}\longrightarrow {{\mbox{PFC E rule2}}})\\ - \bar{W}(\,{{\mbox{PFC E rule1}}}\longrightarrow {{\mbox{PFC E rule2}}})\\ - \bar{W}(\,{{\mbox{PFC E rule2}}}\longrightarrow {{\mbox{PFC E rule1}}})$$

(21)

$${{{{{\rm{CB}}}}}}(\,{{\mbox{PFC E}}}\,\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}})= \bar{W}(\,{{\mbox{PFC E rule1}}}\longrightarrow {{\mbox{PFC PV rule1}}})\\ +\bar{W}(\,{{\mbox{PFC E rule2}}}\,\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2) \\ - \bar{W}(\,{{\mbox{PFC E rule1}}}\,\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2)\\ - \bar{W}(\,{{\mbox{PFC E rule2}}}\longrightarrow {{\mbox{PFC PV rule1}}})$$

(22)

$${{{{{\rm{CB}}}}}}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}})= \bar{W}(\,{{\mbox{PFC PV rule1}}}\,\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2)\\ +\bar{W}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2\longrightarrow \,{{\mbox{PFC PV rule1}}})\\ - \bar{W}(\,{{\mbox{PFC PV rule1}}}\longrightarrow {{\mbox{PFC PV rule1}}})\\ - \bar{W}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2)$$

(23)

$${{{{{\rm{CB}}}}}}(\,{{\mbox{PFC E}}}\,\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}})= \bar{W}(\,{{\mbox{PFC E rule1}}}\,\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}}2)\\ +\bar{W}(\,{{\mbox{PFC E rule2}}}\longrightarrow {{\mbox{PFC error x rule1}}})\\ - \bar{W}(\,{{\mbox{PFC E rule1}}}\longrightarrow {{\mbox{PFC error x rule1}}})\\ - \bar{W}(\,{{\mbox{PFC E rule2}}}\,\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}}2)$$

(24)

$${{{{{\rm{CB}}}}}}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}})= \bar{W}(\,{{\mbox{PFC PV rule1}}}\longrightarrow {{\mbox{PFC error x rule1}}})\\ +\bar{W}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}}2)\\ - \bar{W}(\,{{\mbox{PFC PV rule1}}}\,\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}}2)\\ - \bar{W}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2\longrightarrow \,{{\mbox{PFC error x rule1}}})$$

(25)

$${{{{{\rm{CB}}}}}}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}}\longrightarrow \,{{\mbox{PFC E}}})= \bar{W}(\,{{\mbox{PFC error x rule1}}}\longrightarrow {{\mbox{PFC E rule1}}})\\ +\bar{W}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}}2\longrightarrow \,{{\mbox{PFC E rule2}}})\\ - \bar{W}(\,{{\mbox{PFC error x rule1}}}\longrightarrow {{\mbox{PFC E rule2}}})\\ - \bar{W}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}}2\longrightarrow \,{{\mbox{PFC E rule1}}})$$

(26)

$${{{{{\rm{CB}}}}}}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}}\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}})= \bar{W}(\,{{\mbox{PFC error x rule1}}}\longrightarrow {{\mbox{PFC PV rule1}}})\\ +\bar{W}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}}2\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2)\\ - \bar{W}(\,{{\mbox{PFC error x rule1}}}\,\longrightarrow {{{{{\rm{PFC}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}2)\\ - \bar{W}({{{{{\rm{PFC}}}}}}\,{{{{{\rm{error}}}}}}\,{{{{{\rm{x}}}}}}\,{{{{{\rm{rule}}}}}}2\longrightarrow \,{{\mbox{PFC PV rule1}}})$$

(27)

The connectivity biases between SST neurons and excitatory neurons in the PFC module (Supplementary Fig. 5a–d) was defined similarly as those for PV neurons.

The connectivity biases between the different response location-selective populations in the sensorimotor module (SM) are defined as

$${{{{{\rm{CB}}}}}}(\,{{\mbox{SM response E}}}\longrightarrow {{\mbox{SM response E}}})= \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,1)\\ \; +\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,2) \\ \; +\bar{W}(\,{{\mbox{SM E response 3}}}\longrightarrow {{\mbox{SM E response 3}}}) \\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,1\longrightarrow \,{{\mbox{SM E response 2 and 3}}}) \\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,1\,{{{{{\rm{and}}}}}}\,3)\\ \; - \bar{W}(\,{{\mbox{SM E response 3}}}\,\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,1\,{{{{{\rm{and}}}}}}\,2).$$

(28)

In the last equation, for example, $\bar{W}(\,{{\mbox{SM response 1}}}\longrightarrow {{\mbox{SM response 2 and 3}}})$ represents the mean connection strength from excitatory neurons in the sensorimotor module that prefer response location 1 to those that prefer response locations 2 and 3.

The other connectivity biases were defined similarly

$${{{{{\rm{CB}}}}}}(\,{{\mbox{SM response E}}}\, \longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{response}}}}}}\,{{{{{\rm{PV}}}}}})= \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,1) \\ \;+\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,2)\\ \;+\bar{W}(\,{{\mbox{SM E response 3}}}\,\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,3)\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,2\,{{{{{\rm{and}}}}}}\,3)\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,1\,{{{{{\rm{and}}}}}}\,3)\\ \; - \bar{W}(\,{{\mbox{SM E response 3}}}\,\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,1\,{{{{{\rm{and}}}}}}\,2).$$

(29)

$${{{{{\rm{CB}}}}}}({{{{{\rm{SM}}}}}}\,{{{{{\rm{response}}}}}}\,{{{{{\rm{PV}}}}}}\longrightarrow \,{{\mbox{SM response E}}})= \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,1\longrightarrow \,{{\mbox{SM E response 2 and 3}}})\\ \;+\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,1\,{{{{{\rm{and}}}}}}\,3)\\ \;+\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,3\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,1\,{{{{{\rm{and}}}}}}\,2)\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,1)\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{response}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,2)\\ \; - \bar{W}(\,{{\mbox{SM E response 3}}}\,\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,3).$$

(30)

$${{{{{\rm{CB}}}}}}({{{{{\rm{SM}}}}}}\,{{{{{\rm{response}}}}}}\,{{{{{\rm{PV}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{response}}}}}}\,{{{{{\rm{PV}}}}}})= \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,2\,{{{{{\rm{and}}}}}}\,3)\\ \; +\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,1\,{{{{{\rm{and}}}}}}\,3)\\ \; +\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,3\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,1\,{{{{{\rm{and}}}}}}\,2)\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,1)\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,2)\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,3\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{response}}}}}}\,3).$$

(31)

The connectivity biases between the different rule-selective populations in the sensorimotor module are defined as

$${{{{{\rm{CB}}}}}}({{{{{\rm{SM}}}}}}\,{{{{{\rm{rule}}}}}}\,{{{{{\rm{E}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{rule}}}}}}\,{{{{{\rm{E}}}}}}) \;= \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,1)\\ \;+\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,2)\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,2)\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,1).$$

(32)

The other connectivity biases were defined similarly

$${{{{{\rm{CB}}}}}}({{{{{\rm{SM}}}}}}\,{{{{{\rm{rule}}}}}}\,{{{{{\rm{E}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{rule}}}}}}\,{{{{{\rm{PV}}}}}})= \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,1)\\ \;+\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,2)\\ \; -\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,2)\\ \; -\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,1).$$

(33)

$${{{{{\rm{CB}}}}}}({{{{{\rm{SM}}}}}}\,{{{{{\rm{rule}}}}}}\,{{{{{\rm{PV}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{rule}}}}}}\,{{{{{\rm{E}}}}}})= \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,2)\\ \;+\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,1)\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,1)\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{rule}}}}}}\,2).$$

(34)

$${{{{{\rm{CB}}}}}}({{{{{\rm{SM}}}}}}\,{{{{{\rm{rule}}}}}}\,{{{{{\rm{PV}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{rule}}}}}}\,{{{{{\rm{PV}}}}}}) \;= \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,2)\\ \;+\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,1)\\ \; -\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,1\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,1)\\ \; -\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,2\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{rule}}}}}}\,2).$$

(35)

The connectivity biases between the different shared feature-selective populations in the sensorimotor module are defined similarly. For the populations selective for the two colors

$${{{{{\rm{CB}}}}}}({{{{{\rm{SM}}}}}}\,{{{{{\rm{share}}}}}}\,{{{{{\rm{feature}}}}}}\,({{{{{\rm{color}}}}}})\,{{{{{\rm{E}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{shared}}}}}}\,{{{{{\rm{feature}}}}}}\,({{{{{\rm{color}}}}}})\,{{{{{\rm{E}}}}}})= \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{blue}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{blue}}}}}})\\ \;+\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{red}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{red}}}}}})\\ \; -\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{blue}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{red}}}}}})\\ \; -\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{red}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{blue}}}}}}),$$

(36)

where for example $\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{blue}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{blue}}}}}})$ is the average connection strength within the neural population selective for the shared feature blue.

The other connectivity biases were defined similarly

$${{{{{\rm{CB}}}}}}({{{{{\rm{SM}}}}}}\,{{{{{\rm{shared}}}}}}\,{{{{{\rm{feature}}}}}}\,({{{{{\rm{color}}}}}})\,{{{{{\rm{E}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{shared}}}}}}\,{{{{{\rm{feature}}}}}}\,({{{{{\rm{color}}}}}})\,{{{{{\rm{PV}}}}}})= \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{blue}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{blue}}}}}})\\ \;+\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{red}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{red}}}}}})\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{blue}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{red}}}}}})\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{red}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{blue}}}}}}).$$

(37)

$${{{{{\rm{CB}}}}}}({{{{{\rm{SM}}}}}}\,{{{{{\rm{shared}}}}}}\,{{{{{\rm{feature}}}}}}\,({{{{{\rm{color}}}}}})\,{{{{{\rm{PV}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{shared}}}}}}\,{{{{{\rm{feature}}}}}}\,({{{{{\rm{color}}}}}})\,{{{{{\rm{E}}}}}})= \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{blue}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{red}}}}}})\\ \;+\bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{red}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{blue}}}}}})\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{blue}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{blue}}}}}})\\ \; - \bar{W}({{{{{\rm{SM}}}}}}\,{{{{{\rm{PV}}}}}}\,{{{{{\rm{red}}}}}}\longrightarrow {{{{{\rm{SM}}}}}}\,{{{{{\rm{E}}}}}}\,{{{{{\rm{red}}}}}}).$$

(38)

$${{{{\rm{CB}}}}}({{{\rm{SM}}}}\,{{{\rm{shared}}}}\,{{{\rm{feature}}}}\, ({{{\rm{color}}}})\,{{{\rm{PV}}}} \longrightarrow {{{\rm{SM}}}}\,{{{\rm{shared}}}}\,{{{\rm{feature}}}}\, ({{{\rm{color}}}})\,{{{\rm{PV}}}})= {\bar{W}}({{{\rm{SM}}}}\,{{{\rm{PV}}}}\,{{{\rm{blue}}}}\longrightarrow{{{\rm{SM}}}}\,{{{\rm{PV}}}}\,{{{\rm{red}}}}) \\ \;+{\bar{W}}({{{\rm{SM}}}}\,{{{\rm{PV}}}}\,{{{\rm{red}}}}\longrightarrow{{{\rm{SM}}}}\, {{{\rm{PV}}}}\,{{{\rm{blue}}}}) \\ \; - {\bar{W}}({{{\rm{SM}}}}\,{{{\rm{PV}}}}\,{{{\rm{blue}}}}\longrightarrow{{{\rm{SM}}}}\,{{{\rm{PV}}}}\,{{{\rm{blue}}}}) \\ \; - {\bar{W}}({{{\rm{SM}}}}\,{{{\rm{PV}}}}\,{{{\rm{red}}}}\longrightarrow{{{\rm{SM}}}}\,{{{\rm{PV}}}}\,{{{\rm{red}}}}).$$

(39)

The connectivity biases between populations selective for different shared shapes were defined analogously by substituting blue and red with circle and triangle.

Simulation of the optogenetic inhibition

Optogenetic inhibition was simulated by clamping the activity of neurons at 0 throughout the entire trial and the inter-trial interval.

Principal angle between subspaces

The principal angle between two subspaces is a generalization of angle between lines and planes in Euclidean space to arbitrary dimensions⁹⁸. It can be computed by iteratively finding pairs of unit length “principal vectors”, one from each subspace, that have the greatest inner product, subject to the condition that the principal vectors are orthogonal to all previous principal vectors⁹⁹.

In computing the principal angles between different rule-selective and response-selective subspaces, we first determined the dimensionality of the subspaces using the participation ratio¹⁰⁰. Then the principal angles were computed using the “subspace_angles” function from the Python package Scipy. The largest principal angle was used.

To obtain a shuffled distribution, we first randomly and evenly split all trials belonging to a particular rule or response into two halves. Then, we generated two subspaces from neural trajectories during the two group of trials. A principal angle between these two subspaces was then computed for each rule/response. The angles were then averaged across all rules/responses to obtain a principal angle from shuffled data. This process was repeated 100 times to generate a distribution of principal angles from shuffled data.

Assessing the strength of non-linear mixed selectivity

The extent to which neurons in the sensorimotor module encode the conjunction of stimulus and rule in a non-linear fashion was evaluated using the coefficient of determination of a linear regression model. To tease apart non-linear and linear mixed selectivity, we first fitted the mean activity of each neuron during response period using a set of regressors that represent either the rule or the stimulus alone:

$${{{{{\rm{FR}}}}}}(n,{{{{{\rm{tr}}}}}})={\sum}_{s}{\beta }_{n,s}{\mathbb{1}}({{{{{\rm{stim}}}}}}({{{{{\rm{tr}}}}}})=s)+{\sum}_{r}{\beta }_{n,r}{\mathbb{1}}({{{{{\rm{rule}}}}}}({{{{{\rm{tr}}}}}})=r),$$

(40)

where ${{{{{\rm{FR}}}}}}(n,{{{{{\rm{tr}}}}}})$ is the mean firing rate of neuron n during the response epoch of trial ${{{{{\rm{tr}}}}}}$. ${\mathbb{1}}$ is the indicator function. For example, ${\mathbb{1}}({{{{{\rm{stim}}}}}}({{{{{\rm{tr}}}}}})=s)=1$ if the stimulus during trial ${{{{{\rm{tr}}}}}}$ is s, and it is 0 otherwise.

Then, another linear regression model was fitted on the residual activity unexplained by the linear regression model above, using the conjunction of rule and stimulus as regressors:

$$\tilde{{{{{{\rm{FR}}}}}}}(n,{{{{{\rm{tr}}}}}})={\sum}_{s,r}{\beta }_{n,s,r}{\mathbb{1}}({{{{{\rm{stim}}}}}}({{{{{\rm{tr}}}}}})=s,{{{{{\rm{rule}}}}}}({{{{{\rm{tr}}}}}})=r),$$

(41)

where $\tilde{{{{{{\rm{FR}}}}}}}(n,{{{{{\rm{tr}}}}}})$ is the firing rate of neuron n during trial ${{{{{\rm{tr}}}}}}$ subtracted by the predicted firing rate from the model defined by Equation (40). The R² value of this regression model was used to represent the strength of non-linear mixed selectivity.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All trained networks and pre-processed data used in this study in this study have been deposited at https://drive.google.com/drive/folders/17LqvmcBynX0a4OtUgnkODGJKK_JFRB2Q?usp=sharing Source data are provided with this paper.

Code availability

All computer code used to generate the results in this manuscript has been uploaded to https://github.com/liuyuue/BioRNN_WCST, archived in https://doi.org/10.5281/zenodo.10183167¹⁰¹.

References

Grant, D. A. & Berg, E. A behavioral analysis of degree of reinforcement and ease of shifting to new responses in a weigl-type card-sorting problem. J. Exp. Psychol. 38, 404 (1948).
Article CAS PubMed Google Scholar
Mante, V., Sussillo, D., Shenoy, K. V. & Newsome, W. T. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature 503, 78–84 (2013).
Article ADS CAS PubMed PubMed Central Google Scholar
Rikhye, R. V., Gilra, A. & Halassa, M. M. Thalamic regulation of switching between cortical representations enables cognitive flexibility. Nat. Neurosci. 21, 1753–1763 (2018).
Article CAS PubMed PubMed Central Google Scholar
Milner, B. Effects of different brain lesions on card sorting: The role of the frontal lobes. Arch. Neurol. 9, 90–100 (1963).
Article Google Scholar
Dias, R., Robbins, T. & Roberts, A. Primate analogue of the wisconsin card sorting test: effects of excitotoxic lesions of the prefrontal cortex in the marmoset. Behav. Neurosci. 110, 872 (1996).
Article CAS PubMed Google Scholar
Passingham, R. Non-reversal shifts after selective prefrontal ablations in monkeys (macaca mulatta). Neuropsychologia 10, 41–46 (1972).
Article CAS PubMed Google Scholar
Sakai, K. Task set and prefrontal cortex. Annu. Rev. Neurosci. 31, 219–245 (2008).
Article CAS PubMed Google Scholar
Buckley, M. J. et al. Dissociable components of rule-guided behavior depend on distinct medial and prefrontal regions. Science 325, 52–58 (2009).
Article ADS CAS PubMed Google Scholar
Miller, E. K. & Cohen, J. D. An integrative theory of prefrontal cortex function. Annu. Rev. Neurosci. 24, 167–202 (2001).
Article CAS PubMed Google Scholar
Mansouri, F. A., Matsumoto, K. & Tanaka, K. Prefrontal cell activities related to monkeys’ success and failure in adapting to rule changes in a wisconsin card sorting test analog. J. Neurosci. 26, 2745–2756 (2006).
Article CAS PubMed PubMed Central Google Scholar
Kamigaki, T., Fukushima, T. & Miyashita, Y. Cognitive set reconfiguration signaled by macaque posterior parietal neurons. Neuron 61, 941–951 (2009).
Article CAS PubMed Google Scholar
Sarafyazd, M. & Jazayeri, M. Hierarchical reasoning by neural circuits in the frontal cortex. Science 364, eaav8911 (2019).
Article CAS PubMed Google Scholar
Ito, T., Yang, G. R., Laurent, P., Schultz, D. H. & Cole, M. W. Constructing neural network models from brain data reveals representational transformations linked to adaptive behavior. Nat. Commun. 13, 673 (2022).
Article ADS CAS PubMed PubMed Central Google Scholar
Delevich, K., Tucciarone, J., Huang, Z. J. & Li, B. The mediodorsal thalamus drives feedforward inhibition in the anterior cingulate cortex via parvalbumin interneurons. J. Neurosci. 35, 5743–5753 (2015).
Article CAS PubMed PubMed Central Google Scholar
Pi, H.-J. et al. Cortical interneurons that specialize in disinhibitory control. Nature 503, 521–524 (2013).
Article ADS CAS PubMed PubMed Central Google Scholar
Zhang, S. et al. Long-range and local circuits for top-down modulation of visual cortex processing. Science 345, 660–665 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Muñoz, W., Tremblay, R., Levenstein, D. & Rudy, B. Layer-specific modulation of neocortical dendritic inhibition during active wakefulness. Science 355, 954–959 (2017).
Article ADS PubMed Google Scholar
Keller, A. J. et al. A disinhibitory circuit for contextual modulation in primary visual cortex. Neuron 108, 1181–1193 (2020).
Article CAS PubMed PubMed Central Google Scholar
Wang, X. J., Tegnér, J., Constantinidis, C. & Goldman-Rakic, P. S. Division of labor among distinct subtypes of inhibitory neurons in a cortical microcircuit of working memory. Proc. Natl Acad. Sci., USA 101, 1368–73 (2004).
Article ADS CAS PubMed PubMed Central Google Scholar
Kepecs, A. & Fishell, G. Interneuron cell types are fit to function. Nature 505, 318–326 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Tremblay, R., Lee, S. & Rudy, B. GABAergic interneurons in the neocortex: From cellular properties to circuits. Neuron 91, 260–292 (2016).
Article CAS PubMed PubMed Central Google Scholar
Yang, G. R., Murray, J. D. & Wang, X.-J. A dendritic disinhibitory circuit mechanism for pathway-specific gating. Nat. Commun. 7, 1–14 (2016).
Article ADS Google Scholar
Nakahara, K., Hayashi, T., Konishi, S. & Miyashita, Y. Functional mri of macaque monkeys performing a cognitive set-shifting task. Science 295, 1532–1536 (2002).
Article ADS CAS PubMed Google Scholar
Yang, G. R. & Wang, X.-J. Artificial neural networks for neuroscientists: a primer. Neuron 107, 1048–1070 (2020).
Article CAS PubMed Google Scholar
Jadi, M., Polsky, A., Schiller, J. & Mel, B. W. Location-dependent effects of inhibition on local spiking in pyramidal neuron dendrites. PLoS Comput. Biol. 8, e1002550 (2012).
Article ADS CAS PubMed PubMed Central Google Scholar
Yang, G. R., Joglekar, M. R., Song, H. F., Newsome, W. T. & Wang, X.-J. Task representations in neural networks trained to perform many cognitive tasks. Nat. Neurosci. 22, 297–306 (2019).
Article CAS PubMed Google Scholar
Song, H. F., Yang, G. R. & Wang, X.-J. Training excitatory-inhibitory recurrent neural networks for cognitive tasks: a simple and flexible framework. PLoS Comput. Biol. 12, e1004792 (2016).
Article ADS PubMed PubMed Central Google Scholar
Fuster, J. M. Unit activity in prefrontal cortex during delayed-response performance: neuronal correlates of transient memory. J. Neurophysiol. 36, 61–78 (1973).
Article CAS PubMed Google Scholar
Funahashi, S., Bruce, C. J. & Goldman-Rakic, P. S. Mnemonic coding of visual space in the monkey’s dorsolateral prefrontal cortex. J. Neurophysiol. 61, 331–349 (1989).
Article CAS PubMed Google Scholar
Goldman-Rakic, P. Cellular basis of working memory. Neuron 14, 477–85 (1995).
Article CAS PubMed Google Scholar
Romo, R., Brody, C. D., Hernández, A. & Lemus, L. Neuronal correlates of parametric working memory in the prefrontal cortex. Nature 399, 470–473 (1999).
Article ADS CAS PubMed Google Scholar
Christophel, T. B., Klink, P. C., Spitzer, B., Roelfsema, P. R. & Haynes, J.-D. The distributed nature of working memory. Trends Cogn. Sci. 21, 111–124 (2017).
Article PubMed Google Scholar
Leavitt, M. L., Mendoza-Halliday, D. & Martinez-Trujillo, J. C. Sustained activity encoding working memories: not fully distributed. Trends Neurosci. 40, 328–346 (2017).
Article CAS PubMed Google Scholar
Guo, Z. V. et al. Maintenance of persistent activity in a frontal thalamocortical loop. Nature 545, 181–186 (2017).
Article ADS CAS PubMed PubMed Central Google Scholar
Sreenivasan, K. K. & D’Esposito, M. The what, where and how of delay activity. Nat. Rev. Neurosci. 20, 466–481 (2019).
Article CAS PubMed PubMed Central Google Scholar
Wong, K.-F. & Wang, X.-J. A recurrent network mechanism of time integration in perceptual decisions. J. Neurosci. 26, 1314–1328 (2006).
Article CAS PubMed PubMed Central Google Scholar
Najafi, F. et al. Excitatory and inhibitory subnetworks are equally selective during decision-making and emerge simultaneously during learning. Neuron 105, 165–179 (2020).
Article CAS PubMed Google Scholar
Roach, J. P., Churchland, A. K. & Engel, T. A. Choice selective inhibition drives stability and competition in decision circuits. Nat. Commun. 14, 147 (2023).
Article ADS CAS PubMed PubMed Central Google Scholar
Jia, H., Rochefort, N. L., Chen, X. & Konnerth, A. Dendritic organization of sensory input to cortical neurons in vivo. Nature 464, 1307–1312 (2010).
Article ADS CAS PubMed Google Scholar
Cichon, J. & Gan, W.-B. Branch-specific dendritic ca2+ spikes cause persistent synaptic plasticity. Nature 520, 180–185 (2015).
Article ADS CAS PubMed PubMed Central Google Scholar
Rashid, S. K. et al. The dendritic spatial code: branch-specific place tuning and its experience-dependent decoupling. BioRxiv. https://doi.org/10.1101/2020.01.24.916643 (2020).
Voigts, J. & Harnett, M. T. Somatic and dendritic encoding of spatial variables in retrosplenial cortex differs during 2d navigation. Neuron 105, 237–245 (2020).
Article CAS PubMed Google Scholar
Rigotti, M. et al. The importance of mixed selectivity in complex cognitive tasks. Nature 497, 585–90 (2013).
Article ADS CAS PubMed PubMed Central Google Scholar
Kikumoto, A. & Mayr, U. Conjunctive representations that integrate stimuli, responses, and rules are critical for action selection. Proc. Natl Acad. Sci. 117, 10603–10608 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
Kikumoto, A., Mayr, U. & Badre, D. The role of conjunctive representations in prioritizing and selecting planned actions. Elife 11, e80153 (2022).
Article CAS PubMed PubMed Central Google Scholar
Wallis, J. D., Anderson, K. C. & Miller, E. K. Single neurons in prefrontal cortex encode abstract rules. Nature 411, 953–956 (2001).
Article ADS CAS PubMed Google Scholar
Botvinick, M. M., Cohen, J. D. & Carter, C. S. Conflict monitoring and anterior cingulate cortex: an update. Trends Cogn. Sci. 8, 539–546 (2004).
Article PubMed Google Scholar
Quilodran, R., Rothe, M. & Procyk, E. Behavioral shifts and action valuation in the anterior cingulate cortex. Neuron 57, 314–325 (2008).
Article CAS PubMed Google Scholar
Kolling, N. et al. Value, search, persistence and model updating in anterior cingulate cortex. Nat. Neurosci. 19, 1280–1285 (2016).
Article CAS PubMed PubMed Central Google Scholar
Mansouri, F. A., Freedman, D. J. & Buckley, M. J. Emergence of abstract rules in the primate brain. Nat. Rev. Neurosci. 21, 595–610 (2020).
Article CAS PubMed Google Scholar
Spellman, T., Svei, M., Kaminsky, J., Manzano-Nieves, G. & Liston, C. Prefrontal deep projection neurons enable cognitive flexibility via persistent feedback monitoring. Cell 184, 2750–2766 (2021).
Article CAS PubMed Google Scholar
Salzman, C. D. & Fusi, S. Emotion, cognition, and mental state representation in amygdala and prefrontal cortex. Annu. Rev. Neurosci. 33, 173–202 (2010).
Article CAS PubMed PubMed Central Google Scholar
Holroyd, C. B. & Coles, M. G. The neural basis of human error processing: reinforcement learning, dopamine, and the error-related negativity. Psychol. Rev. 109, 679 (2002).
Article PubMed Google Scholar
Matsumoto, M. & Hikosaka, O. Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 459, 837–841 (2009).
Article ADS CAS PubMed PubMed Central Google Scholar
Andersen, R. A. & Cui, H. Intention, action planning, and decision making in parietal-frontal circuits. Neuron 63, 568–583 (2009).
Article CAS PubMed Google Scholar
Balleine, B. W. & O’doherty, J. P. Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology 35, 48–69 (2010).
Article PubMed Google Scholar
Conway, B. R. Color vision, cones, and color-coding in the cortex. Neuroscientist 15, 274–290 (2009).
Article PubMed Google Scholar
Lafer-Sousa, R. & Conway, B. R. Parallel, multi-stage processing of colors, faces and shapes in macaque inferior temporal cortex. Nat. Neurosci. 16, 1870–1878 (2013).
Article CAS PubMed PubMed Central Google Scholar
Chang, L., Bao, P. & Tsao, D. Y. The representation of colored objects in macaque color patches. Nat. Commun. 8, 2064 (2017).
Article ADS PubMed PubMed Central Google Scholar
Churchland, M. M. et al. Neural population dynamics during reaching. Nature 487, 51–56 (2012).
Article ADS CAS PubMed PubMed Central Google Scholar
Steinmetz, N. A., Zatka-Haas, P., Carandini, M. & Harris, K. D. Distributed coding of choice, action and engagement across the mouse brain. Nature 576, 266–273 (2019).
Article CAS PubMed PubMed Central Google Scholar
Tseng, S.-Y., Chettih, S. N., Arlt, C., Barroso-Luque, R. & Harvey, C. D. Shared and specialized coding across posterior cortical areas for dynamic navigation decisions. Neuron 110, 2484–2502 (2022).
Article CAS PubMed PubMed Central Google Scholar
Findling, C. et al. Brain-wide representations of prior information in mouse decision-making. BioRxiv. https://doi.org/10.1101/2023.07.04.547684 (2023).
Lab, I. B. et al. A brain-wide map of neural activity during complex behaviour. bioRxiv. https://doi.org/10.1101/2023.07.04.547681 (2023).
Froudist-Walsh, S. et al. A dopamine gradient controls access to distributed working memory in the large-scale monkey cortex. Neuron 109, 3500–3520 (2021).
Article CAS PubMed PubMed Central Google Scholar
Mejías, J. F. & Wang, X.-J. Mechanisms of distributed working memory in a large-scale network of macaque neocortex. Elife 11, e72136 (2022).
Article PubMed PubMed Central Google Scholar
Rigotti, M., Rubin, D. B. D., Wang, X.-J. & Fusi, S. Internal representation of task rules by recurrent dynamics: the importance of the diversity of neural responses. Front. Comput. Neurosci. 4, 24 (2010).
Article PubMed PubMed Central Google Scholar
Dehaene, S. & Changeux, J.-P. The wisconsin card sorting test: Theoretical analysis and modeling in a neuronal network. Cereb. Cortex 1, 62–79 (1991).
Article CAS PubMed Google Scholar
Turner-Evans, D. et al. Angular velocity integration in a fly heading circuit. Elife 6, e23496 (2017).
Article PubMed PubMed Central Google Scholar
Hayden, B. Y., Pearson, J. M. & Platt, M. L. Neuronal basis of sequential foraging decisions in a patchy environment. Nat. Neurosci. 14, 933–939 (2011).
Article CAS PubMed PubMed Central Google Scholar
Semedo, J. D., Zandvakili, A., Machens, C. K., Byron, M. Y. & Kohn, A. Cortical areas interact through a communication subspace. Neuron 102, 249–259 (2019).
Article CAS PubMed PubMed Central Google Scholar
Langdon, C., Genkin, M. & Engel, T. A. A unifying perspective on neural manifolds and circuits for cognition. Nat. Rev. Neurosci. 24, 363–377 (2023).
Article CAS PubMed PubMed Central Google Scholar
Lee, S.-H. et al. Activation of specific interneurons improves v1 feature selectivity and visual perception. Nature 488, 379–383 (2012).
Article ADS CAS PubMed PubMed Central Google Scholar
Lee, S.-H., Kwan, A. C. & Dan, Y. Interneuron subtypes and orientation tuning. Nature 508, E1–E2 (2014).
Article CAS PubMed Google Scholar
Wilson, N. R., Runyan, C. A., Wang, F. L. & Sur, M. Division and subtraction by distinct cortical inhibitory networks in vivo. Nature 488, 343–348 (2012).
Article ADS CAS PubMed PubMed Central Google Scholar
Fu, Y. et al. A cortical circuit for gain control by behavioral state. Cell 156, 1139–1152 (2014).
Article CAS PubMed PubMed Central Google Scholar
Pfeffer, C. K., Xue, M., He, M., Huang, Z. J. & Scanziani, M. Inhibition of inhibition in visual cortex: the logic of connections between molecularly distinct interneurons. Nat. Neurosci. 16, 1068–1076 (2013).
Article CAS PubMed PubMed Central Google Scholar
Loomba, S. et al. Connectomic comparison of mouse and human cortex. Science 377, eabo0924 (2022).
Article CAS PubMed Google Scholar
Kamigaki, T., Fukushima, T. & Miyashita, Y. Neuronal signal dynamics during preparation and execution for behavioral shifting in macaque posterior parietal cortex. J. Cogn. Neurosci. 23, 2503–2520 (2011).
Article PubMed Google Scholar
Purcell, B. A. & Kiani, R. Hierarchical decision processes that operate over distinct timescales underlie choice and changes in strategy. Proc. Natl Acad. Sci. 113, E4531–E4540 (2016).
Article ADS CAS PubMed PubMed Central Google Scholar
Johnston, K., Levin, H. M., Koval, M. J. & Everling, S. Top-down control-signal dynamics in anterior cingulate and prefrontal cortex neurons following task switching. Neuron 53, 453–462 (2007).
Article CAS PubMed Google Scholar
Tsutsui, K.-I., Hosokawa, T., Yamada, M. & Iijima, T. Representation of functional category in the monkey prefrontal cortex and its rule-dependent use for behavioral selection. J. Neurosci. 36, 3038–3048 (2016).
Article CAS PubMed PubMed Central Google Scholar
Bernardi, S. et al. The geometry of abstraction in the hippocampus and prefrontal cortex. Cell 183, 954–967 (2020).
Article CAS PubMed PubMed Central Google Scholar
Goudar, V. et al. A comparison of rapid rule-learning strategies in humans and monkeys. Journal of Neuroscience, 44, e0231232024 (2024).
Kennerley, S. W., Walton, M. E., Behrens, T. E., Buckley, M. J. & Rushworth, M. F. Optimal decision making and the anterior cingulate cortex. Nat. Neurosci. 9, 940–947 (2006).
Article CAS PubMed Google Scholar
Xue, C., Kramer, L. E. & Cohen, M. R. Dynamic task-belief is an integral part of decision-making. Neuron 110, 2503–2511 (2022).
Article CAS PubMed PubMed Central Google Scholar
Ben-Artzi, I., Kessler, Y., Nicenboim, B. & Shahar, N. Computational mechanisms underlying latent value updating of unchosen actions. Sci. Adv. 9, eadi2704 (2023).
Article PubMed PubMed Central Google Scholar
Shenhav, A., Botvinick, M. M. & Cohen, J. D. The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron 79, 217–240 (2013).
Article CAS PubMed PubMed Central Google Scholar
Fusi, S., Asaad, W. F., Miller, E. K. & Wang, X.-J. A neural circuit model of flexible sensorimotor mapping: learning and forgetting on multiple timescales. Neuron 54, 319–333 (2007).
Article CAS PubMed PubMed Central Google Scholar
Anastasiades, P. G., Collins, D. P. & Carter, A. G. Mediodorsal and ventromedial thalamus engage distinct l1 circuits in the prefrontal cortex. Neuron 109, 314–330 (2021).
Article CAS PubMed Google Scholar
Mukherjee, A., Lam, N. H., Wimmer, R. D. & Halassa, M. M. Thalamic circuits for independent control of prefrontal signal and noise. Nature 600, 100–104 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Garcia-Junco-Clemente, P. et al. An inhibitory pull–push circuit in frontal cortex. Nat. Neurosci. 20, 389–392 (2017).
Article CAS PubMed PubMed Central Google Scholar
Su, Z. & Cohen, J. Y. Two types of locus coeruleus norepinephrine neurons drive reinforcement learning. BioRxiv. https://doi.org/10.1101/2022.12.08.519670 (2022).
Cho, J. R. et al. Dorsal raphe dopamine neurons modulate arousal and promote wakefulness by salient stimuli. Neuron 94, 1205–1219 (2017).
Article CAS PubMed Google Scholar
Jiang, X. et al. Principles of connectivity among morphologically defined cell types in adult neocortex. Science 350, aac9462 (2015).
Article PubMed PubMed Central Google Scholar
Werbos, P. J. Backpropagation through time: what it does and how to do it. Proc. IEEE 78, 1550–1560 (1990).
Article Google Scholar
Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
Jordan, C. Essai sur la géométrie à n dimensions. Bull. de. la Soci.été math.ématique de. Fr. 3, 103–174 (1875).
Google Scholar
Björck, k & Golub, G. H. Numerical methods for computing angles between linear subspaces. Math. Comput. 27, 579–594 (1973).
Article MathSciNet Google Scholar
Gao, P. et al. A theory of multineuronal dimensionality, dynamics and measurement. BioRxiv. https://doi.org/10.1101/214262 (2017).
Liu, Y. & Wang, X.-J. Flexible gating between subspaces in a neural network model of internally guided task switching https://doi.org/10.5281/zenodo.10183167 (2024).

Download references

Acknowledgements

This work was supported by James Simons Foundation Grant 543057SPI (X.-J.W.), the National Institutes of Health grant R01MH062349 (X.-J.W.), and the ONR grant N00014-23-1-2040 (X.-J.W.). We thank (alphabetically) Aldo Battista, Mark Buckley, Sage Chen, Shuo Chen, Vishwa Goudar, Kenneth Kay, Haohong Li’s lab, Jianguang Ni’s lab, Yu Qi’s lab, Bo Shen, Yi Sun’s lab, Lucas Tian, Xiaohan Zhang and all members of X.-J.W.’s lab for helpful discussions and comments on the manuscript.

Author information

Authors and Affiliations

Center for Neural Science, New York University, New York, NY, 10003, USA
Yue Liu & Xiao-Jing Wang

Authors

Yue Liu
View author publications
Search author on:PubMed Google Scholar
Xiao-Jing Wang
View author publications
Search author on:PubMed Google Scholar

Contributions

Y.L. and X.-J.W. designed the research, discussed regularly throughout the project and wrote the manuscript. Y.L. performed the research.

Corresponding author

Correspondence to Xiao-Jing Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Tomoki Fukai and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Liu, Y., Wang, XJ. Flexible gating between subspaces in a neural network model of internally guided task switching. Nat Commun 15, 6497 (2024). https://doi.org/10.1038/s41467-024-50501-y

Download citation

Received: 12 January 2024
Accepted: 10 July 2024
Published: 01 August 2024
DOI: https://doi.org/10.1038/s41467-024-50501-y

This article is cited by

Neural correlates of communication modes in medical students using fMRI
- Raluca Corina Oprea
- Frederic Andersson
- Wissam El-Hage
Brain Imaging and Behavior (2025)

Subjects

Abstract

Similar content being viewed by others

Introduction

Results

Training modular recurrent neural networks with different types of inhibitory neurons

Two rule attractor states in the PFC module maintained by interactions between modules

Two emergent subpopulations of excitatory neurons in the PFC module

Maintaining and switching rule states via structured connectivity patterns between subpopulations of neurons within the PFC module

Top-down propagation of the rule information through structured long-range connections

Structured input and output connections of the sensorimotor module enable rule-dependent action selection

Recurrent connectivity and dynamics within the sensorimotor module

SST neurons are essential to dendritic top-down gating

Discussion

Mapping between model components and brain regions

Attractor states supported by inter-areal connections

Circuit mechanism in the frontal-parietal network for rule maintenance and update

Connecting subspace to circuits

Cued versus un-cued rule switching

Dynamics of behavior during un-cued rule switching

Other limitations of the work

Methods

Model setup

Variations in the model hyperparameters

Dendritic nonlinearities

Initializations

Sparsity of the SST → dendrite connectivity in the sensorimotor module

Random seeds

Task

Representation of inputs and outputs

Training method

Single neuron selectivity metric

Classification criteria for different neuronal populations

Connectivity bias

Simulation of the optogenetic inhibition

Principal angle between subspaces

Assessing the strength of non-linear mixed selectivity

Reporting summary

Data availability

Code availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Peer review

Peer review information

Additional information

Supplementary information

Source data

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Search

Quick links