Main

A fundamental component of neural computation is how populations of neurons encode and transmit information to downstream brain areas1,2. Each cortical area communicates with many areas and contains a heterogeneous population of neurons that project to distinct downstream targets3,4,5,6. For the transmission of information between brain areas, the relevant neural codes are likely formed by populations of neurons that communicate with the same downstream target area, allowing their activity to be read out collectively. However, in most studies of neural population codes, populations have been analyzed without knowledge of whether the cells project to the same target. It therefore remains an open question what principles underlie coding in populations of neurons that project to the same target area.

The information encoded in a population of neurons is affected by correlations between the activity of different neurons. Experimental and theoretical work has demonstrated how the correlations in activity between pairs of neurons can either enhance the population’s information, due to synergistic neuron–neuron correlations, or increase redundancy between neurons, which establishes robust transmission but limits the information encoded7. Most of this understanding arises from considerations of typical or average pairwise correlation values in populations. However, recent work reports that pairwise correlations in large populations can take on additional network structures, such as hubs of redundant or synergistic interactions8,9,10. Also, theoretical studies propose that a network-level structure of pairwise correlations may enhance neural population information11. Notably, whether projection pathways have network-level structures that enhance the information encoded in the projection pathway or aid their transmission to other brain areas has not been studied.

We studied the population codes in projection pathways between cortical areas in the posterior parietal cortex (PPC). PPC is a sensory-motor interface involved in decision-making tasks, including those involved in navigation12,13,14,15,16,17,18. PPC has heterogeneous activity profiles, including cells encoding various sensory modalities, locomotor movements and cognitive signals, such as spatial and choice information12,19,20,21,22. PPC is densely interconnected with cortical and subcortical regions in a network containing retrosplenial cortex (RSC) and anterior cingulate cortex (ACC)23. In addition, population codes in PPC contain correlations between neurons that benefit behavior24,25,26. Here we study PPC in a flexible navigation-based decision-making task because navigation decisions require the coordination of multiple brain areas to integrate signals across areas and also because PPC activity is necessary for mice to solve navigation decision tasks12,27,28,29.

We developed statistical multivariate modeling methods to investigate the population codes in cells sending axonal projections to the same target. We discovered that, in PPC neurons projecting to the same target, pairwise correlations are stronger and arranged into a specialized network structure of interactions. This structure consists of pools of neurons with enriched within-pool and reduced across-pool information-enhancing (IE) interactions, with respect to a randomly structured network. This structure enhances the amount of information about the mouse’s choice encoded by the population, with proportionally larger contributions for larger population sizes. Remarkably, this IE structure is only present in populations of cells projecting to the same target, and not in neighboring populations with unidentified outputs. Such structure is present when mice make correct choices, but not when they make incorrect choices. We propose that specialized network structures in PPC populations, which comprise an output pathway, enhance signal propagation in a manner that may facilitate accurate decision-making.

Results

A delayed match-to-sample task that isolates components of flexible navigation decisions

We developed a delayed match-to-sample task using navigation in a virtual reality T-maze (Fig. 1a)27. The T-stem contained a black or white sample cue followed by a delay maze segment with identical visual patterns on every trial. When mice passed a specific location, a test cue was revealed as a white tower in the left T-arm and a black tower in the right T-arm, or vice versa. The sample cue and test cue were chosen randomly and independently in each trial, and the two types of each cue defined four trial types. Mice received rewards when they turned into the T-arm whose color matched the sample cue. Thus, mice combined a memory of the sample cue with the test cue identity to choose a turn direction at the T-intersection. After training, mice performed this task with approximately 80% accuracy (Extended Data Fig. 1a,b). Incorrect trials occurred interleaved with correct trials at a relatively constant rate throughout the session, suggesting that errors were due to inaccurate decision-making rather than disengagement (Extended Data Fig. 1c).

Fig. 1: Differences in the activity of neurons projecting to distinct cortical targets.
figure 1

a, Schematic of a delayed match-to-sample task in virtual reality. b, Retrograde virus injections to label PPC neurons projecting to ACC, RSC and contralateral PPC (cPPC). Images are 250 × 350 μm. c, Depth distribution of PPC neurons projecting to different areas. Error bars indicate mean ± s.e.m. over mice. d, Normalized mean deconvolved calcium traces of PPC neurons sorted based on the cross-validated peak time. Vertical gray lines represent onsets of sample cue, delay, test cue, turn into T-arms and reward. Gray dashed lines correspond to 1 s before turn and 0.5 s before reward. Color bar shows normalized mean deconvolved calcium intensity (a.u.). e, Cumulative distribution of the peak activity times for neurons. Compared to the nonlabeled population, ***P < 0.001 for ACC-projecting (blue) and RSC-projecting (green) neurons, P = 0.13 for cPPC-projecting (red) neurons; two-sample KS-test. Error bars are s.e.m. computed from bootstrapping. f, Mean ± s.e.m. deconvolved calcium activity of different populations. Nonlabeled neurons are shown in black. g, Mean ± s.e.m. deconvolved calcium activity of example neurons with different encoding properties is shown for different trial conditions. Each trace corresponds to a trial type with a given sample cue and test cue in correct (green choice arrow) or incorrect (red choice arrow) trials. Neurons 1 to 4 encode the sample cue, the test cue, the left–right turn direction (choice) and the reward direction, respectively. Neuron 5 is active in only one of the four trial conditions, both in correct and incorrect trials, and thus encodes multiple task variables. Each subpanel contains four traces, corresponding to the four trial types. Some traces are hidden if they contain deconvolved calcium activity values of zero. Nonlabeled, n = 2,506 cells; cPPC projection, n = 93 cells; RSC projection, n = 127 cells; ACC projection, n = 134 cells. a.u., arbitrary units.

We used two-photon calcium imaging to measure the activity of hundreds of neurons simultaneously in layer 2/3 of PPC. We injected retrograde tracers conjugated to fluorescent dyes of different colors to identify neurons with axonal projections to ACC, RSC and contralateral PPC (Fig. 1b and Extended Data Fig. 1d,g). These areas are major recipients of projections from layer 2/3 PPC neurons4, and the ACC–RSC–PPC network is important for navigation-based decision tasks29,30. The PPC neurons projecting to ACC, RSC and contralateral PPC were intermingled, except that ACC-projecting neurons were enriched in superficial layer 2/3 (Fig. 1c). Neurons projecting to the same area were slightly closer to each other in anatomical space than unlabeled neurons were (Extended Data Fig. 1f). Neurons labeled with multiple retrograde tracers were not observed.

Neurons were transiently active during task trials, with different neurons active at different times, and the activity of the population tiled the trial12 (Fig. 1d and Extended Data Fig. 1h). ACC-projecting cells had higher activity early in the trial, while RSC-projecting cells had higher activity later. Contralateral PPC-projecting neurons had more uniform activity across the trial (Fig. 1e,f). These differences in activity levels across the trial suggest that neurons projecting to different targets could contribute to different stages of information processing (Extended Data Fig. 3a). Individual neurons could encode the sample cue (neuron 1; Fig. 1g), the test cue (neuron 2), the left–right turn direction (choice) (neuron 3) and the combination of the sample cue and test cue that indicates the reward direction (neuron 4). Note that the reward direction (combination of the sample and test cues) and choice are identical on correct trials and opposite on incorrect trials. We also identified neurons that were active only on one of the four trial types in both correct and incorrect trials, thus encoding multiple task variables (neuron 5; Fig. 1g).

Vine copula models to analyze encoding in multivariate neural and behavioral data

To quantify the selectivity of neurons for a task variable, we isolated the contribution of the variable to a neuron’s activity while controlling for other variables that also contribute. This was important because neural activity is modulated by movements of the mouse19,20,31, and a mouse’s movements correlate with task variables (Extended Data Fig. 2a,c). We considered the locomotor movements used to control the virtual environment.

We adapted nonparametric vine copula (NPvC) models to estimate the multivariate dependence among a neuron’s activity, task variables and movement variables (Fig. 2a). This method expresses the multivariate probability densities as the product of a copula, which quantifies the statistical dependencies among all these variables, and of the marginal distributions conditioned on time, task variables and movement variables32,33,34. The mutual information between two variables depends only on the copula and not on the marginal distributions35. Using a sequential probabilistic graphical model called the vine copula32,34, we broke down the complex, data-hungry estimation of the full multivariate dependencies into a sequence of simpler, data-robust bivariate dependencies that were estimated using a nonparametric kernel-based method35 (Fig. 2a). This approach takes into account correlations between all the variables in the multivariate probability, does not make assumptions about the form of the marginal distributions and their dependencies, and is able to capture nonlinear dependencies between variables, thus providing advantages over conventional methods such as generalized linear models (GLMs)24,36,37. By discounting collinearities between task and behavioral variables, this method isolates the contribution of individual variables and improves the estimation of information in neural activity (Extended Data Fig. 4). Using the NPvC, we estimated the expected activity of a neuron for any value of task and movement variables and at any time in the trial (Fig. 2b). The NPvC predicted frame-by-frame held-out neural activity better than a GLM (Fig. 2c)24,36,37.
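To make the copula idea concrete, here is a minimal illustrative sketch, not the analysis code of this study (which uses nonparametric kernel copulas combined in a vine construction): each variable is mapped to uniform margins through its empirical CDF, and dependence, and hence mutual information, is then measured on the copula scale, independently of the marginal distributions. A Gaussian copula is assumed for simplicity, and `gaussian_copula_mi` is a hypothetical helper name.

```python
import numpy as np
from scipy import stats

def gaussian_copula_mi(x, y):
    """Estimate mutual information (bits) between x and y under a
    Gaussian-copula assumption.  The marginals are removed by the
    probability-integral transform, so only the dependence structure
    (the copula) contributes to the estimate."""
    # empirical CDF -> approximately uniform margins
    u = stats.rankdata(x) / (len(x) + 1)
    v = stats.rankdata(y) / (len(y) + 1)
    # map to standard-normal scores and estimate the copula correlation
    zu, zv = stats.norm.ppf(u), stats.norm.ppf(v)
    rho = np.corrcoef(zu, zv)[0, 1]
    # closed-form MI of a bivariate Gaussian copula
    return -0.5 * np.log2(1.0 - rho ** 2)
```

Because the estimate depends only on ranks, it is unchanged by monotone transformations of either variable, which is the sense in which mutual information depends only on the copula and not on the marginals.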

Fig. 2: Vine copula modeling of neural activity.
figure 2

a, Schematic of NPvC model of neural activity (r) as a function of time (t), a vector of movement variables (x) with components (\({x}_{1},\ldots ,{x}_{n})\) and task variable (c). Conditional vine copulas are built between neural activity and all the other variables for each task variable (c). Mixing the vine copula and marginal distributions gives the conditional density function of neural activity and other variables. The vine copula model can be used either to estimate the value of neural activity conditioned over all the other variables (which is the copula fit \({r}_{\mathrm{NPC}}\)) or to generate samples, or to estimate various conditional entropy and mutual information values. b, Deconvolved calcium activity of two example neurons (black) and the cross-validated predictions of the NPvC model (orange, dashed line) and GLM (pink, solid line). c, Cumulative distribution of FDE across neurons for the GLM and the NPvC model.

We used the NPvC model to estimate the mutual information that could be decoded from a neuron’s activity about each task variable at each time point. This was computed as the information between the actual value of the task variable and the one decoded from the posterior probabilities of task variables computed with the NPvC, conditioned on all other measured variables19,24. In simulations of neural activity modulated either linearly or nonlinearly by movement variables, the NPvC outperformed a GLM at fitting data with nonlinear dependencies on movement variables (Extended Data Fig. 4 and Supplementary Note ‘Comparison of the performance of NPvC and GLM on simulated neural population data’). The NPvC and GLM both correctly estimated the information conveyed by individual neurons and the information from neuron pairs, conditioned upon movement variables, when the tuning to behavioral variables was linear. When the tuning was nonlinear, the GLM underestimated the neuronal information, whereas the NPvC performed well. Thus, the NPvC provides a more accurate estimate of information and is more robust to nonlinear tuning, consistent with its better fits to the data.
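The information measures above are decoded-information estimates: the mutual information between the actual task-variable value and the value decoded from the model posterior. The final step of such a computation can be sketched from the empirical confusion matrix of actual versus decoded labels (illustrative only; `decoded_information` is a hypothetical name, and the bias corrections typically applied in practice are omitted):

```python
import numpy as np

def decoded_information(actual, decoded):
    """Mutual information (bits) between actual and decoded task-variable
    values, estimated from the empirical joint (confusion) matrix."""
    actual, decoded = np.asarray(actual), np.asarray(decoded)
    labels = np.unique(np.concatenate([actual, decoded]))
    joint = np.zeros((labels.size, labels.size))
    for i, a in enumerate(labels):
        for j, d in enumerate(labels):
            joint[i, j] = np.mean((actual == a) & (decoded == d))
    pa = joint.sum(axis=1, keepdims=True)   # marginal of actual values
    pd = joint.sum(axis=0, keepdims=True)   # marginal of decoded values
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = joint * np.log2(joint / (pa * pd))
    return np.nansum(terms)                 # 0 * log 0 terms drop out
```

Perfect decoding of a balanced binary variable yields 1 bit; a decoder that ignores the data yields 0 bits.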

Preferential, but widespread, routing of information

While our focus is on population codes in projection pathways, we first established the information encoded in single neurons and whether that information is specialized in projection pathways. The PPC neurons contained information about each task variable, even after conditioning on the movement variables (Fig. 3a). Sample cue information was high in the sample, delay and test segments. Both sample cue and test cue information were appreciable in the early part of the test segment when the cues needed to be combined to inform a choice (Fig. 3a, left). The PPC neurons thus carried information about the reward direction (combination of the sample and test cues) and the choice, but the choice information was larger, indicating that PPC activity was more related to the turn direction selected by the mouse than the reward direction defined by the cues (Fig. 3a, middle). In addition, PPC neurons contained information about the mouse’s movements (Fig. 3a, right). Individual neurons contained relatively low information, likely due to transient activity patterns, trial-to-trial variability and multiplexing of information about different variables12,19,20,24,26.

Fig. 3: Single-neuron information in labeled projection neurons and nonlabeled cells.
figure 3

a, Time course of different information components in all PPC neurons. Shading indicates mean ± s.e.m. b, Average single-neuron information in different populations about different task variables during the first 2 s after sample cue onset, delay onset or test cue onset. Error bars indicate mean ± s.e.m. across cells. *P < 0.05, **P < 0.01 and ***P < 0.001, computed using two-sided t test with Holm–Bonferroni correction for multiple comparisons. Nonlabeled, n = 2,506 cells; cPPC projection, n = 93 cells; RSC projection, n = 127 cells; ACC projection, n = 134 cells. rewdir, reward direction.

Information for the sensory-related task variables—sample cue, test cue and their combination that indicates the reward direction—was enriched in ACC-projecting neurons and lowest in contralateral PPC-projecting cells (Fig. 3b). Thus, PPC preferentially transmits sensory information to ACC. In contrast, information about the choice and movements was similar across the projection types, indicating that this information is more uniformly transmitted (Fig. 3b). All three projection types had lower information about the mouse’s movements than the unlabeled cells, suggesting that movement information is enriched in neurons projecting to areas not studied here (Fig. 3b, right). Cells projecting to contralateral PPC often had less information about each variable than the unlabeled cells, suggesting that across-hemisphere communication is less critical for encoding specific task and movement events. RSC-projecting neurons carried the sensory and choice information typical of the PPC population (Extended Data Fig. 3a). Therefore, neurons projecting to different targets differ in their encoding, revealing a specialized routing of signals. However, each projection contains significant information about each variable, showing that PPC also broadcasts its information widely. To understand population coding in projection pathways, we focused on choice information because it was the largest of the task-variable information components.

Enriched IE pairwise interactions in neurons projecting to the same target

The structure of correlated activity patterns in populations of neurons can impact the transmission and reading out of information7,38. We computed pairwise noise correlations7,38, defined as the correlations in activity for a pair of neurons for a fixed trial type (Methods; Fig. 4a). We focused on the first 2 s after the test cue onset. Remarkably, noise correlations were significantly larger in pairs of neurons projecting to the same target than in unlabeled neurons with unidentified projection patterns, both for signed values (Fig. 4b, left, and Extended Data Fig. 5a) and absolute values (Extended Data Fig. 5b), suggesting that correlations are a key part of coding in output pathways. Noise correlations were higher on correct trials than on incorrect trials, consistent with the possibility that correlations are functionally relevant in guiding behavior25 (Fig. 4b and Extended Data Fig. 5a). We also considered that behavioral variability within a given trial type could contribute to trial-to-trial variability and thus potentially to noise correlations. After using the single-neuron NPvC models to compute partial correlations regressing out the effect of movement variability, noise correlations were lower, confirming that movement variability contributed to traditional noise correlation measures (Fig. 4b, right, and Extended Data Fig. 5a). Even in this case, noise correlations were higher in neurons projecting to the same target than in pairs of unlabeled neurons and higher in correct trials (Fig. 4b, right, and Extended Data Fig. 5a). These patterns of noise correlations were present even for pairs of neurons with similar separation in anatomical distance (Extended Data Fig. 5c).
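The standard computation this corresponds to can be sketched as follows (an illustrative Python sketch with assumed variable names, not the authors' code; the partial-correlation variant that regresses out movement variables is omitted): within each trial type, the trial-type mean is subtracted so that only trial-to-trial "noise" remains, and the residuals of the two neurons are then correlated.

```python
import numpy as np

def noise_correlation(act_a, act_b, trial_types):
    """Pearson correlation of trial-to-trial residuals for two neurons.

    act_a, act_b : (n_trials,) mean activity of each neuron per trial
                   (e.g. averaged over the first 2 s after test-cue onset)
    trial_types  : (n_trials,) label of each trial's type
    """
    res_a = np.empty_like(act_a, dtype=float)
    res_b = np.empty_like(act_b, dtype=float)
    for tt in np.unique(trial_types):
        idx = trial_types == tt
        # subtract the trial-type mean so that shared stimulus/choice
        # tuning (signal correlation) does not contribute
        res_a[idx] = act_a[idx] - act_a[idx].mean()
        res_b[idx] = act_b[idx] - act_b[idx].mean()
    return np.corrcoef(res_a, res_b)[0, 1]
```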

Fig. 4: Pairwise interactions between pairs of nonlabeled neurons and pairs of neurons projecting to the same target.
figure 4

a, Schematic of the models to compute pairwise joint probability density functions and conditional joint probability density functions of two neurons with correlated activities (\({r}_{1},{r}_{2}\)) as a function of time (t). A vector of movement variables (x) with components (\({x}_{1},\ldots ,{x}_{n})\) is represented. Using single-neuron NPvC model outputs, we build different types of pairwise correlation models with or without conditioning over the movement variables. The joint pairwise model is then used to estimate noise correlations or interaction information. b, Left: noise correlations computed for pairs of nonlabeled neurons and pairs of neurons projecting to the same area for correct and incorrect trials. Right: same except for noise correlations conditioned on movement variables. c, Similar to b but for interaction information. d, Average single-neuron choice information in different populations during the first 2 s after the test onset for correct and incorrect trials over all nonlabeled and projection cells. e, Histogram of interaction information divided into pairs of IE (red), IL (blue) and independent pairs (green). In bd the average is computed over all the simultaneously recorded pairs of nonlabeled or same-target projection cells. Error bars indicate mean ± s.e.m. across all pairs of neurons. *P < 0.05, and ***P < 0.001, t test with two-sided Holm–Bonferroni correction for statistical multiple comparisons. Nonlabeled, n = 145,439 pairs; same projection, n = 1,355 pairs.

Noise correlations can either reduce or enhance the information in neural populations, depending on the well-characterized relationships between signal and noise correlations7,38. To quantify how much a neuron pair’s noise correlations increase (IE) or decrease (information-limiting (IL)) the information about a task variable, we computed the interaction information for each pair of neurons (Fig. 4a). This was defined as the mutual information between the actual value of the task variable and the value decoded from the pair’s activity using NPvCs trained with noise correlations, minus the information between the variable’s actual value and the value decoded from the pair’s activity using NPvCs trained blind to noise correlations. The former refers to the actual information carried by the pair, and the latter refers to independent information.
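The logic of this quantity can be sketched with a simplified stand-in decoder (a two-class linear discriminant in place of the NPvC pairwise models; all names are illustrative): the correlation-blind decoder is trained on within-class trial-shuffled data, which preserves each neuron's tuning but destroys noise correlations, and both decoders are evaluated on the recorded, correlated activity.

```python
import numpy as np

def _lda_predict(train_X, train_y, test_X):
    """Two-class linear discriminant: w = Sigma^{-1} (mu1 - mu0)."""
    X0, X1 = train_X[train_y == 0], train_X[train_y == 1]
    mu0, mu1 = X0.mean(0), X1.mean(0)
    cov = (np.cov(X0.T) * (len(X0) - 1) + np.cov(X1.T) * (len(X1) - 1)) / (
        len(X0) + len(X1) - 2)
    w = np.linalg.solve(cov, mu1 - mu0)
    thr = w @ (mu0 + mu1) / 2
    return (test_X @ w > thr).astype(int)

def _mi_bits(a, b):
    """Mutual information (bits) between two binary label arrays."""
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            p = np.mean((a == va) & (b == vb))
            if p > 0:
                mi += p * np.log2(p / (np.mean(a == va) * np.mean(b == vb)))
    return mi

def interaction_information(r1, r2, stim, seed=0):
    """Information from a decoder trained WITH noise correlations minus
    information from a decoder trained BLIND to them (within-class trial
    shuffle), both evaluated on the real, correlated activity."""
    rng = np.random.default_rng(seed)
    real = np.column_stack([r1, r2]).astype(float)
    shuf = real.copy()
    for s in np.unique(stim):
        idx = np.flatnonzero(stim == s)
        shuf[idx, 1] = shuf[rng.permutation(idx), 1]  # kill correlations
    pred_corr = _lda_predict(real, stim, real)
    pred_blind = _lda_predict(shuf, stim, real)
    return _mi_bits(stim, pred_corr) - _mi_bits(stim, pred_blind)
```

In this sketch a pair whose second neuron carries no signal but shares noise with the first yields positive interaction information, because the correlation-aware decoder can subtract the shared noise.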

Interaction information was on average positive, and thus noise correlations were on average IE (Fig. 4c). Remarkably, interaction information on correct trials was larger in pairs of neurons projecting to the same area compared to pairs of unlabeled cells (Fig. 4c). Furthermore, interaction information was higher on correct trials than on incorrect trials. For pairs of neurons projecting to the same target, interaction information was even closer to zero on incorrect trials (Fig. 4c). In contrast, in single neurons, choice information was similar between correct and incorrect trials (Fig. 4d), suggesting a specific role of neuron–neuron interactions for generating correct behavioral choices. Similar results were obtained when comparing decoding performance (fraction correct) of NPvC decoders trained with or without correlations (Extended Data Fig. 5g). Therefore, pairwise interactions differ in populations projecting to the same target relative to surrounding neurons, enhance the information in an output pathway and may aid accurate decisions.

Interestingly, there was a wide range of interaction information values across pairs of neurons, with extended and almost symmetric tails for pairs of neurons with IE (significantly positive) and IL (significantly negative) interaction information (Fig. 4e and Extended Data Fig. 5f). Pairs of neurons for which the sign of signal correlations (that is, similarity of choice selectivity of the neurons) was the same as the sign of noise correlations had IL interactions, whereas pairs with opposite signs for signal and noise correlations had IE interactions38,39,40 (Extended Data Fig. 5k,l). For both IE and IL pairs, interaction information was higher in absolute value in correct trials compared to error trials, consistent with higher noise correlations resulting in greater levels of both types of interactions41 (Extended Data Fig. 5e). The IE interactions had higher magnitude interaction information in pairs of neurons with the same projection target compared to unlabeled pairs, whereas IL interactions had similar magnitudes of interaction information between same projection and unlabeled pairs (Extended Data Fig. 5e). Thus, PPC contains a rich mix of IE and IL pairs, with IE interactions having a preferential contribution to output pathways.

We also computed the interaction information for individual projection targets and the sample cue and test cue (Extended Data Fig. 5d). The RSC-projecting pairs had similar interaction information about the mouse’s choice compared to unlabeled pairs, consistent with RSC-projecting neurons being most like the unlabeled population. Test cue information had similar but weaker results to choice information. For sample cue information, interaction information was not different between cells projecting to the same target and unlabeled pairs. Thus, there exists a possible specificity of interactions between neurons with respect to different task variables and projection pathways.

The network structure of pairwise interactions

Two networks with the same set of IL and IE interactions can differ in how these interactions are organized within the network (Fig. 5a,b). For example, the same set of interaction pairs can be distributed either randomly (Fig. 5a, top) or structured as clusters containing enriched IE or IL interactions (Fig. 5a, middle and bottom).

Fig. 5: The presence of network structure of interaction information in populations projecting to the same target.
figure 5

a, Schematics of a random network (top), a network with clustered interactions (middle) and a network with modular structure (bottom). b, Sketch of how networks with identical probabilities of IE and IL interactions can be organized randomly or in clusters of IE or IL pairs. Red and (+) indicate IE pairs. Blue and (−) indicate IL pairs. c, Relative triplet probability with respect to a random network for nonlabeled and same-target populations during correct (left) and incorrect (right) trials. The random network has the same distribution of pairwise interactions, except that they are shuffled between neurons. d, Global cluster coefficient of IE and IL subnetworks relative to a random network. e, Schematic of a two-pool network model. f, Schematic of the space of two-pool networks, quantified in terms of the probability of IE pairs in pool 1 (x axis) and pool 2 (y axis) minus the probability of these pairs in a random network. Red indicates IE pools and blue indicates IL pools. g, Schematic of some examples of symmetric and asymmetric networks corresponding to different points in the two-pool network space along the diagonal (red dashed line) or antidiagonal (blue dashed line) axes. h, Left: the probabilities of different triplets computed for different two-pool networks minus the triplet probabilities in a random network, computed analytically as derived in Supplementary Note ‘Analytical calculation of triplet probabilities in the two-pool model of network structure’ for two-pool networks. Right: the triplet probabilities for three example model networks sampled from the space of two-pool networks. i, At each point in the 2D space of two-pool networks, comparison of the triplet probabilities for the model network and the empirical data. The similarity index ranges from 1 for similar networks to 0 for completely different networks.
Yellow circles correspond to networks that are more similar to data for nonlabeled (left) and same-target projection networks (right). The four dashed contours correspond to two-pool networks with similar value of each of the four triplet probabilities to the one obtained from data (from Fig. 5c for correct trials). j, Using pools of neurons defined based on their choice selectivity to left or right choices, the average interaction information within the pool and between the pools, relative to a random network. Pink indicates same-target projections. Black indicates nonlabeled neurons. Statistical tests are performed over all the pairs of neurons in each pool. k, Similar to j but for the probability of IE pairs within or between pools, relative to a random network. Error bars indicate mean ± s.e.m. estimated using bootstrapping. *P < 0.05, **P < 0.01 and ***P < 0.001, two-sided t test with Holm–Bonferroni correction for statistical multiple comparisons. Nonlabeled, n = 1,080,382 triplets; same projection, n = 30,204 triplets.

Graph-theoretic measures can identify structured arrangements of pairwise links in a network. The simplest motifs in a graph beyond pairs are interaction triplets42,43. A network with IL clusters has a larger number of ‘−,−,−’ triplets compared to a random network, and a network with IE clusters has more ‘+,+,+’ triplets compared to a random network, where ‘−’ and ‘+’ indicate IL and IE pairwise interaction links, respectively (Fig. 5a,b). For interactions of choice information, we computed the difference in the probabilities of ‘−,−,−’, ‘−,−,+’, ‘−,+,+’ and ‘+,+,+’ triplets between our data and an unstructured network obtained by randomly shuffling the position of pairwise interactions within the network, without changing the set of interaction values. In the unlabeled population, triplet probabilities were like those in a random network, indicating that the pairwise interactions are not structured (Fig. 5c). However, in populations of neurons projecting to the same target, there was an enrichment of ‘+,+,+’ and ‘−,−,+’ triplets and fewer ‘−,−,−’ and ‘−,+,+’ triplets compared to a random network. Notably, this structure was present only on correct trials and not when mice made incorrect choices (Fig. 5c). The network structure was present in ACC-projecting, RSC-projecting and contralateral PPC-projecting populations for choice information but was less apparent for sample cue and test cue information (Extended Data Fig. 6a,b).
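Counting these motifs is straightforward; below is a minimal sketch (illustrative names, not the analysis code). The returned array `p` gives the probability of a triplet with k IE links, so `p[0]` is the ‘−,−,−’ motif and `p[3]` the ‘+,+,+’ motif; the random-network baseline is obtained by permuting the link values over pair positions and recounting.

```python
import numpy as np
from itertools import combinations

def triplet_probabilities(sign_matrix):
    """Probabilities of the '-,-,-', '-,-,+', '-,+,+' and '+,+,+' triplet
    motifs in a network of signed pairwise interactions.

    sign_matrix : (n, n) symmetric matrix, +1 for IE pairs, -1 for IL pairs
    Returns p where p[k] is the probability of a triplet with k IE links.
    """
    n = sign_matrix.shape[0]
    counts = np.zeros(4)
    for i, j, k in combinations(range(n), 3):
        n_pos = sum(s > 0 for s in
                    (sign_matrix[i, j], sign_matrix[i, k], sign_matrix[j, k]))
        counts[n_pos] += 1
    return counts / counts.sum()
```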

We then used a graph global clustering coefficient42,43 to measure IL or IE clusters. This coefficient compares the frequency of specific triplets in real data to a shuffled network, measured as the ratio between the number of specific closed triplets and the number of all triplets, normalized to the same quantity computed from shuffled networks. On correct trials, the clustering coefficient obtained from data was larger than in a shuffled network for IE interactions, meaning that these interactions were clustered. In contrast, for IL interactions, the clustering coefficient was smaller than in a shuffled network, indicating that IL interactions were dispersed across the network rather than clustered (Fig. 5d). This clustering was not found in unlabeled neurons or when mice made incorrect choices (Fig. 5d). Thus, pairwise interactions are clustered in the network of neurons projecting to the same target.
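A sketch of such a measure (illustrative; the exact normalization of refs. 42,43 may differ): the global clustering coefficient of the IE-only (or IL-only) subnetwork, divided by its mean value over networks in which the same links are placed at random.

```python
import numpy as np

def global_clustering(adj):
    """Global clustering coefficient of an undirected, unweighted graph:
    3 x (number of triangles) / (number of connected triplets)."""
    A = np.array(adj, dtype=float)          # copy so the caller is untouched
    np.fill_diagonal(A, 0)
    triangles = np.trace(A @ A @ A) / 6.0   # each triangle counted 6 times
    deg = A.sum(axis=1)
    paths2 = (deg * (deg - 1) / 2).sum()    # open + closed length-2 paths
    return 3.0 * triangles / paths2 if paths2 else 0.0

def clustering_vs_shuffle(sign_matrix, sign=+1, n_shuffle=200, seed=0):
    """Clustering of the IE (sign=+1) or IL (sign=-1) subnetwork relative
    to networks with the same number of links placed at random."""
    rng = np.random.default_rng(seed)
    n = sign_matrix.shape[0]
    iu = np.triu_indices(n, k=1)
    links = sign_matrix[iu] == sign
    def build(mask):
        A = np.zeros((n, n))
        A[iu] = mask
        return A + A.T
    observed = global_clustering(build(links))
    null = np.mean([global_clustering(build(rng.permutation(links)))
                    for _ in range(n_shuffle)])
    return observed / null
```

A ratio above 1 indicates clustering of that interaction type; below 1 indicates dispersion.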

To exemplify how the network’s topology relates to the triplet distributions, we considered a simple two-pool model in which the network structure can be varied parametrically, while keeping the overall set of values for IE and IL interactions constant (Fig. 5e). Each pool was parameterized by the difference in probability of IE pairwise interactions within the pool relative to a random network. Thus, the range of possible network models resides in a two-dimensional (2D) space consisting of the enrichment of IE interactions in pool 1 along one axis and in pool 2 along the second axis (Fig. 5f). The diagonal corresponds to symmetric networks, in which both pools have more IE or more IL interactions than a random network. The antidiagonal corresponds to asymmetric networks, where one pool has more IE interactions and the other has more IL interactions. Networks near the origin are like a random network.
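A two-pool network of this kind can be sampled as follows (an illustrative parameterization, not the authors' analytical model: an IE link is drawn with probability p1 within pool 1, p2 within pool 2 and p_across between pools, and enrichment is measured relative to the realized overall IE fraction, i.e. relative to a random network with the same set of links):

```python
import numpy as np

def two_pool_network(n1, n2, p1, p2, p_across, seed=0):
    """Sample a signed two-pool interaction network.  Returns the sign
    matrix (+1 IE, -1 IL) and the enrichment of within-pool IE links
    relative to a random network with the same overall IE fraction."""
    rng = np.random.default_rng(seed)
    n = n1 + n2
    pool = np.array([0] * n1 + [1] * n2)
    S = np.zeros((n, n), dtype=int)
    iu = np.triu_indices(n, k=1)
    same = pool[iu[0]] == pool[iu[1]]
    prob = np.where(same, np.where(pool[iu[0]] == 0, p1, p2), p_across)
    signs = np.where(rng.random(iu[0].size) < prob, 1, -1)
    S[iu] = signs
    S += S.T
    p_overall = np.mean(signs == 1)
    enrich = [np.mean(signs[(pool[iu[0]] == k) & (pool[iu[1]] == k)] == 1)
              - p_overall for k in (0, 1)]
    return S, enrich
```

Setting p1 = p2 > p_across gives the symmetric, within-pool-enriched regime described below; p1 > p_overall > p2 gives an asymmetric network on the antidiagonal of the model space.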

For every point in the 2D space of network models, we analytically computed the triplet probabilities compared to a random network and then mapped the triplet probabilities estimated in our data to the parametric model (Fig. 5h, left, and 5i). Populations of cells projecting to the same target mapped to a symmetric network with both pools having enriched within-pool IE interactions and elevated across-pool IL interactions (Fig. 5i, right). In contrast, the population of unlabeled neurons mapped closely to the origin, corresponding to a randomly structured network (Fig. 5i, left).

This network structure was present in pools of neurons defined by their choice selectivity. For unlabeled neurons, the interaction information and proportion of IE interactions were similar to those of a random network, both within pools of neurons with similar choice preferences and across pools with different preferences (Fig. 5j,k). In contrast, for neurons with the same projection target, pools of cells with the same choice preference had higher interaction information and a larger proportion of IE interactions than the random network (Fig. 5j,k). Furthermore, in these same-target projection populations, cell pairs with opposite choice preferences had lower interaction information and an enrichment of IL interactions. This structure was strong on correct trials and largely absent when mice made incorrect choices.
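The within-pool versus across-pool comparison above can be sketched as follows (illustrative names; a hypothetical helper, not the analysis code): average the pairwise interaction information within and across selectivity pools, and subtract the corresponding averages in networks where the same interaction values are placed at random positions.

```python
import numpy as np

def pool_enrichment(ii_matrix, pool_labels, n_shuffle=1000, seed=0):
    """Mean interaction information within vs. across pools, relative to
    networks with the same interaction values placed at random.

    ii_matrix   : (n, n) symmetric pairwise interaction information
    pool_labels : (n,) pool assignment of each neuron (e.g. choice preference)
    Returns [within-pool enrichment, across-pool enrichment].
    """
    pool_labels = np.asarray(pool_labels)
    n = pool_labels.size
    iu = np.triu_indices(n, k=1)
    same = pool_labels[iu[0]] == pool_labels[iu[1]]
    vals = ii_matrix[iu]
    obs = np.array([vals[same].mean(), vals[~same].mean()])
    null = np.zeros((n_shuffle, 2))
    for k in range(n_shuffle):
        # shuffle the interaction values over pair positions, keeping the set
        v = np.random.default_rng(seed + k).permutation(vals)
        null[k] = [v[same].mean(), v[~same].mean()]
    return obs - null.mean(axis=0)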

Together, these results reveal rich structure in the pairwise interaction information that is approximated by a network of symmetric pools with enriched within-pool IE interactions and across-pool IL interactions. Notably, this structure was only present in neurons projecting to the same target, not in neighboring unlabeled neurons, and only when mice made correct choices. This finding could support the propagation of information to downstream targets, leading to accurate behavior.

The contribution of the network structure of pairwise interactions to population information

To test how the network-level structure of pairwise interactions affects the information encoded in a neural population11, we analytically expanded the information encoded by the population about a given task variable as a sum of contributions of interaction graph motifs—nodes (single-neuron information), links (pairwise interaction information) and triplets (triplet-wise arrangements of pairwise interaction information; Fig. 6b). We illustrate this expansion for the case of a symmetric network with the assumption that single-pair information is small compared to single-neuron information, which describes our data well (Fig. 6b–d). For calculating information, we used an expansion of population information that is also valid for nonsymmetric networks (Extended Data Fig. 8d and Supplementary Note ‘Analytical expansion of the population information’).

Fig. 6: The contribution of network structure of interaction information to the population information.

a, Schematic of how the population information can be decomposed into the following three components: independent information, unstructured interaction information and structured interaction information. b, The population information can be expanded as a sum of contributions from network motifs of increasing complexity. For a symmetric network, the dominant contributions are from single neurons, interaction pairs and interaction triplets. c, In symmetric networks, triplets contribute to structured interaction information with a positive or negative sign if they have an even or odd number of IL interactions, respectively. d, Structured interaction information in the space of symmetric two-pool networks. The structured interaction information can be IE or IL, depending on whether the pools have enriched IE or IL pairs. Pink and black points correspond to the networks of same-target projection and nonlabeled neurons, respectively. e, Independent, unstructured interaction and structured interaction information for nonlabeled (black) and same-target projection (pink) neurons during correct and incorrect trials for increasing population size. Shading indicates mean ± s.e.m. estimated by bootstrapping; the information values were computed using the analytical terms presented in Supplementary Note ‘Analytical calculation of triplet probabilities in the two-pool model of network structure’. ***P < 0.001, between nonlabeled and same-target projection neurons in correct trials, two-sided t-test with Holm–Bonferroni correction for multiple comparisons at n = 75 population size. Nonlabeled, n = 1,080,382 triplets; same projection, n = 30,204 triplets.

We broke down the population’s total information about a task variable into three components (Fig. 6a). The independent information is the information the population would carry if it had the same single-neuron properties but zero pairwise noise correlations7,39,44,45. The interaction information, which is the difference between the total population information and the independent information, was divided into two components. The unstructured interaction information is the information in a population with the same values of pairwise interactions as the data, but randomly rearranged across neurons by shuffling (no network structure), minus the independent information. This component quantifies the contribution of pairwise interactions to the population information, irrespective of their arrangement. The structured interaction information is defined as the difference between the total population information and the population information in an unstructured network (Fig. 6a). This component isolates the contribution of the network arrangement of the pairwise interactions.
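This shuffle-based decomposition can be sketched numerically. The example below uses the linear Fisher information of a Gaussian population, f′ᵀC⁻¹f′, as a convenient stand-in for the paper's information measure (an illustration of the logic, not the paper's exact estimator); `decompose` randomly rearranges the off-diagonal covariances to estimate the unstructured component.

```python
import numpy as np

def fisher_info(fprime, cov):
    """Linear Fisher information of a Gaussian population: f'^T C^{-1} f'."""
    return float(fprime @ np.linalg.solve(cov, fprime))

def decompose(fprime, cov, n_shuffles=200, seed=0):
    """Split total information into independent, unstructured-interaction
    and structured-interaction components by shuffling pairwise terms."""
    rng = np.random.default_rng(seed)
    n = len(fprime)
    total = fisher_info(fprime, cov)
    independent = fisher_info(fprime, np.diag(np.diag(cov)))
    iu, ju = np.triu_indices(n, k=1)
    offdiag = cov[iu, ju]
    shuffled = []
    for _ in range(n_shuffles):
        perm = rng.permutation(offdiag.size)
        c = np.diag(np.diag(cov)).astype(float)
        c[iu, ju] = offdiag[perm]
        c[ju, iu] = offdiag[perm]
        if np.all(np.linalg.eigvalsh(c) > 0):  # keep valid covariances only
            shuffled.append(fisher_info(fprime, c))
    unstructured = float(np.mean(shuffled)) - independent
    structured = total - independent - unstructured
    return independent, unstructured, structured
```

By construction, the three components sum to the total information; the structured term is nonzero only when the arrangement of the pairwise terms across the network matters.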

The structured interaction information depends on the triplet-wise arrangements of pairwise interaction information. Based on analytic calculations, each triplet contributes to the structured interaction information with a sign that depends on the product of the signs of its pairwise interactions (Fig. 6c and Supplementary Note ‘Relating structured interaction information and triplet probabilities’). There is a positive contribution for {‘+,+,+’, ‘−,−,+’} triplets and a negative contribution for {‘−,−,−’, ‘−,+,+’} triplets. The sign of the structured interaction information, therefore, depends on the difference in the probability of triplets relative to a shuffled network lacking structure. The magnitude of the structured interaction information depends on these triplet probabilities as well as on the single-neuron information and pairwise interaction values within the triplets.
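The stated sign rule reduces to the product of the three pairwise signs, which can be verified directly (a trivial sketch, with +1 for IE and −1 for IL):

```python
def triplet_sign(pair_signs):
    """Sign of a triplet's contribution to structured interaction
    information: the product of its pairwise interaction signs
    (+1 for IE, -1 for IL). An even number of IL edges gives +1."""
    s = 1
    for x in pair_signs:
        s *= x
    return s

positive = [(+1, +1, +1), (-1, -1, +1)]  # contribute positively
negative = [(-1, -1, -1), (-1, +1, +1)]  # contribute negatively
```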

To visualize how structured interaction information changes with the topology of the network, we computed the value of the structured interaction information in the simple two-pool network defined in ‘The network structure of pairwise interactions’, making the additional assumption that the values of single-neuron information and the absolute values of interaction information of each triplet are homogeneous across the network (Fig. 6d). Symmetric networks with enriched IE pools (compared to a randomly structured network) have positive (IE) structured interaction information, whereas symmetric networks with enriched IL pools have negative (IL) structured interaction information (Fig. 6d).

Using the single-neuron information, pairwise interactions and triplet probabilities estimated from our data, we computed the independent, unstructured interaction and structured interaction components of the population information for choice, both for a symmetric network (Fig. 6e) and for a nonsymmetric network (Extended Data Fig. 8d). The independent information was the largest component and was similar between the same-target projection and unlabeled populations (Fig. 6e, left). The unstructured and structured interaction information contributed less to population information, as expected, because pairwise interaction information values were smaller than single-neuron information values (Fig. 4c,d). Strikingly, the contributions of both structured and unstructured interaction information were markedly larger for neurons projecting to the same target than for unlabeled neurons (Fig. 6e, middle and right). The larger unstructured interaction information for populations projecting to the same target is a consequence of their higher pairwise interaction values. The higher structured interaction information for populations projecting to the same target is instead a consequence of their network structure (Extended Data Fig. 8e). The structured interaction information contributed close to zero in the unlabeled population, consistent with the lack of network structure in this population. The structured interaction information in the cells projecting to the same target was positive and IE, consistent with this population having a symmetric network structure with enriched within-pool IE interactions and across-pool IL interactions compared to a randomly structured network.

Notably, both mathematical calculations (Extended Data Fig. 10 and Supplementary Note ‘Analytical expansion of the population information’) and numerical evaluations (Fig. 6e) show that the contribution of structured interaction information scales faster with increasing population size than the other two components. Also, the structured interaction information was mostly absent on incorrect trials, whereas independent information was largely unchanged, suggesting that the structure of network interactions may carry information to guide accurate behavior. We compared the size of the structured interaction information to the information needed for behavioral choices. On correct trials for a population of 75 cells, the 0.015 bits carried by the structured interaction information (Fig. 6e) contribute 1.5% of the information needed to always make a correct choice (1 bit) and 5% of the information needed to perform as well as the mouse (~80% correct performance corresponds to ~0.3 bits of mutual information between the rewarded choice and the mouse’s choice; dashed line in Extended Data Fig. 1a). Based on this and the supra-linear scaling with population size, the structured interaction information in a population of a few hundred cells is expected to contribute a sizeable fraction of the total information needed to make correct choices.
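The ~0.3 bits figure follows from the mutual information between the rewarded and actual choice in a binary task. Under the simplifying assumptions of balanced trial types and symmetric errors, I = 1 − H2(p), with H2 the binary entropy:

```python
import math

def choice_mutual_information(p_correct):
    """Mutual information (bits) between rewarded choice and the mouse's
    choice, assuming balanced binary trials and symmetric error rates:
    I = 1 - H2(p), where H2 is the binary entropy function."""
    h2 = -(p_correct * math.log2(p_correct)
           + (1.0 - p_correct) * math.log2(1.0 - p_correct))
    return 1.0 - h2

print(round(choice_mutual_information(0.8), 3))  # -> 0.278, i.e. ~0.3 bits
print(round(0.015 / 0.3, 3))                     # -> 0.05, the quoted 5%
```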

We found similar trends for network structure for choice information for populations of neurons projecting to contralateral PPC, RSC and ACC (Extended Data Fig. 7c). Consistent with differences in triplet probabilities for sample cue and test cue information with respect to the choice information (Extended Data Fig. 6a), the network information was either IE or IL when computed for sample cue or test cue information (Extended Data Fig. 7b), suggesting the specificity of network structure with respect to the information content.

Together, these results reveal a rich network structure in the pairwise interactions within a population that strikingly enhances information levels and is only present in cells projecting to the same target and when mice make correct choices. Thus, populations of PPC cells projecting to the same target are organized in a specialized manner to enhance the propagation of information to downstream targets, potentially aiding accurate decision-making.

Discussion

The understanding of population codes has been built on studies of coding in populations of all neurons recorded from a given location, likely including neurons projecting to different target areas and thus mixing neurons that are read out by different downstream networks. Instead, by studying population codes in neurons that project to the same target area, we discovered that neurons comprising an output pathway in PPC form specialized population codes.

A first specialized feature is an elevated strength of pairwise correlations. Both IL and IE correlations are higher when mice make correct choices relative to incorrect ones, suggesting that they both have functional relevance. The enrichment of IE correlations in neurons projecting to the same target, compared to unlabeled neighboring neurons, suggests that these interactions aid the transmission of information between areas. On the other hand, IL correlations may support accurate behavior by increasing the efficacy and robustness of information transmission24,25,46. However, because IL correlations are similar in neurons projecting to the same target and neighboring cells, they may be of general utility and not specialized to communication across areas.

A second specialized feature is a network-level structure of pairwise interactions that enhances the population’s information about the mouse’s choice. This enhancement arises from a diverse mix of IE and IL pairwise interactions. While the information provided by the network structure is small relative to that of single neurons in small populations, we estimate that it grows rapidly with population size and provides a substantial fraction of the information carried by the whole population projecting to a specific area. In addition, the network-level structure is present only when mice make correct choices, is absent when mice make incorrect choices and is not apparent in populations of neighboring neurons. Thus, the enhanced information encoding carried by this structure may be key to correct behavior and specialized for information transmission between areas. Future work should investigate whether structured synaptic connections between cells within the same output pathways contribute to this network structure11,47.

We established a framework to link network-level structures to their effect on population codes. Extending this formalism, possibly with tools to identify interacting populations across areas2,48,49, may allow studies of how the network structure of information encoding within a ‘sending’ area affects transmission to downstream areas. For example, specific triplets, such as ‘−,−,+’ triplets, may be particularly useful for information transmission because they enhance information while containing IL correlations11. Previous work has focused on how pairwise and higher-order correlations shape the information encoded in a population7,38,41,50,51,52,53, without considering how these interactions are arranged as a network. Recent empirical work has identified network structures in visual cortex8, and theoretical studies indicate how network structures may impact the encoding of information7,11. Our results contribute by demonstrating how structured interactions in populations comprising an output pathway enhance information encoding.

We developed NPvC models that more effectively discount covariations between task and movement variables and thus better estimate the relationships between variables in a large multivariate setting, compared to simpler models19,20,31,54 (Extended Data Fig. 4). If covariations between task and movement variables are not accurately accounted for, dependencies across variables can mask underlying IE and IL interactions and prevent discovering network structures (Extended Data Fig. 4e,f). In simulations, the NPvC produced more accurate single-cell information estimates than a GLM when the modulation by behavioral variables was nonlinear (Extended Data Fig. 4 and Supplementary Note ‘Comparison of the performance of NPvC and GLM on simulated neural population data’). Moreover, when the modulation was nonlinear, the NPvC better estimated the interaction information because undiscounted effects of cotuning to behavioral variables generate artificial redundancy. Consequently, the NPvC provides a better estimate of the distribution of pairwise interactions within the network than a GLM. In our empirical data, the NPvC fit the neural activity better than the GLM. For single-neuron information, a GLM reproduced the overall distribution of single-neuron information across task variables and projection targets, although with reduced values (Extended Data Fig. 9a). The GLM showed positive, although reduced, interaction information in correct trials, with the reduced values arising from an overestimation of redundancy (Extended Data Fig. 9c). However, the GLM did not reveal a network structure of the interactions (Extended Data Fig. 9d). Therefore, NPvC models are beneficial for understanding information in large populations and complex behavioral settings.

While PPC broadcasts information about all task variables to ACC, RSC and contralateral PPC, we observed some specificity in the types of information transmitted in these pathways. Sensory-related information was strongest in ACC-projecting neurons, consistent with sensory and motor communication pathways between parietal and frontal cortices55. In contrast, contralateral PPC projections carried less information about the task, implying that cross-hemisphere interactions serve a role other than computing specific task quantities and may instead help maintain activity patterns56. All three intracortical projection pathways had lower information about the mouse’s movements, indicating that cells projecting to other targets, perhaps subcortical areas, might be more related to the biasing of movements57. Collectively, these findings both support the notion of specificity in the information flow in cortex58,59,60,61 and fit with findings of highly distributed representations across cortex and work that has failed to identify categories of neurons in PPC based on activity profiles19,20,22,30,31,61,62,63.

Together, our findings demonstrate that specialized network interaction structures in PPC facilitate the transmission of choice signals to downstream areas and that these structures may be important for guiding accurate behavior. Our results suggest that the organizing principles of neural population codes can be better understood in terms of optimizing the transmission of encoded information to target projection areas rather than in terms of encoding information locally.

Methods

Mice

All experimental procedures were approved by the Harvard Medical School Institutional Animal Care and Use Committee and were performed in compliance with the Guide for the Care and Use of Laboratory Animals. Imaging data were collected from ten male C57BL/6J mice that were 8 weeks old at the initiation of behavioral task training (stock 000664, Jackson Labs).

Virtual reality system

The virtual reality system has been described previously26. Head-restrained mice ran on an 8-inch diameter spherical treadmill. A PicoP micro-projector was used to back-project the virtual world onto a half-cylindrical screen with a diameter of 24 inches. Forward/backward translation was controlled by changes in treadmill pitch (relative to the mouse’s body), and rotation in the virtual environment (virtual heading direction) was controlled by treadmill roll (relative to the mouse’s body). Movements of the treadmill were detected by an optical sensor positioned beneath the air-supported treadmill. Mazes were constructed using VirMEn in MATLAB64.

Behavioral training

Before behavioral training, surgery was performed in 8-week-old mice to attach a titanium headplate to the skull using dental cement. At least 1 day after implantation, mice began a water schedule, receiving at least 1 ml of water per day. Body weights were monitored daily to ensure they remained greater than 80% of the original weight. Mice were trained to perform the delayed match-to-sample task using a series of progressively more complex mazes. Mice were rewarded with 4 μl of water on correct trials. First, naive mice were trained to run down a straight virtual corridor of increasing length to obtain water rewards. In the second stage, mice learned to run in a T-shaped maze, making a choice between the left and right arms. The correct choice arm was signaled by the presence of a tall tower in that arm. After the mice were able to run straight on the ball and make turns, we trained the mice on stage 3, where we began to familiarize them with running into the choice arm that matched the color presented in the maze stem (sample cue). The walls of the stem were either black or white (sample cue; randomly selected from trial to trial). The left and right choice T-arms were black and white, respectively, or vice versa (test cue). At this stage of training, the correct choice arm was signaled both by a tall tower in the correct arm and by the T-arm that matched the color of the sample cue. Notably, at this stage, the mouse could still perform accurately even if it ignored the sample and test cues, as it could simply run to the choice arm with the tower. In the fourth stage, the maze was the same, except that we added a tower to the unrewarded choice arm as well. In this stage, the mouse could not simply run to whichever arm had a tower (because both arms had towers) and had to run into the arm that matched the color of the sample cue.
In the fifth stage, the maze was exactly like the previous one, except the choice arms and towers (test cue) appeared gray as the mouse ran down the maze. The colors of the choice arms and towers (test cue) were not revealed until the mouse reached three-fourths of the way down the stem. When the black and white choice arms were revealed, the mouse could begin to plan and execute a turn left or right. In the final stage, we began training toward the final implementation of the delayed match-to-sample task. We introduced a delay segment by making the walls of the T-stem gray at the end of the stem and gradually increasing the length of the stem that was gray (delay segment). The mouse received its first trials of the delayed match-to-sample task when at least the entire last one-fourth of the stem walls were gray. In such trials, when the mouse’s position reached three-fourths of the way down the stem, there was a short moment in which both the stem walls and the choice arms were gray. At this point, the mouse had to rely on its memory of the initial sample cue walls to make the correct turn. We gradually increased the length of the gray segment until the mouse performed with over 85% accuracy, with a delay averaging at least 2 s in duration. The entire training program was completed in 12–18 weeks.

Surgery

When mice reliably performed the delayed match-to-sample task, the cranial window implant surgery was performed. Mice were given free access to water for 2 days before surgery. During the surgery, mice were anesthetized with 1.5% isoflurane and the headplate was removed. Craniotomies were performed over PPC centered at 2 mm posterior and 1.75 mm lateral to bregma. GCaMP6 was injected at three locations spaced 200 μm apart at the center of the PPC. A micromanipulator (MP285, Sutter) was used to position a glass pipette approximately 250 μm below the dura, and a volume of approximately 50 nl was pressure-injected over 5–10 min. Dental cement sealed a glass coverslip on the craniotomy, and a new headplate was implanted, along with a ring to interface with a black rubber objective collar that blocked light from the VR system during imaging. Craniotomies were also performed over the contralateral PPC (2 mm posterior and 1.75 mm lateral to bregma) and ACC (1.34 mm anterior and 0.38 mm lateral to bregma, and 1.3 mm ventral to the surface of the dura). More than 200 nl of retrograde tracer (CTB-Alexa647, CTB-Alexa405 or red retrobeads) was injected at each site. The retrograde tracer injection into RSC was made through the craniotomy for PPC imaging, targeting the most medial portion of the craniotomy (~0.5 mm lateral from the midline). Craniotomies for the retrograde tracers were sealed with KwikSil (World Precision Instruments). Mice recovered for 2–3 days after surgery and then resumed their water schedule. Imaging was performed nearly daily in each mouse, starting 2 weeks after surgery, and continued for 1–2 weeks.

Obtaining anatomical stacks

Anatomical stacks of cells double-labeled with GCaMP and retrograde tracers were acquired using a two-photon microscope, taken at 2-μm intervals from the surface of the dura to approximately 300 μm below the surface. For images of cells double-labeled with GCaMP and CTB-Alexa647, we used a dichroic mirror (562-nm long pass, Semrock) and bandpass filters (525/50 nm and 675/67 nm, Semrock) and delivered the excitation light at 820 nm to visualize both GCaMP and the Alexa fluorophore simultaneously. We also took images of GCaMP and CTB-Alexa405 using a dichroic mirror (484-nm long pass) and bandpass filters (525/50 nm and 435/40 nm) and delivered excitation light at 800 nm. Finally, we took images of GCaMP and red retrobeads with dichroic mirror (562-nm long pass) and bandpass filters (525/50 nm and 609/57 nm) and delivered excitation light at 820 nm. These anatomical stacks enabled us to visually identify cells as either double-labeled with GCaMP and retrograde tracer or GCaMP-expressing and unlabeled with retrograde tracer. We then matched these GCaMP-expressing neurons from the anatomical stacks with the same GCaMP-expressing neurons during functional imaging, which was performed at 920 nm and during behavior. In some sessions, we took z-stacks immediately before functional imaging to ensure that some labeled cells were present in the field of view.

GCaMP imaging during behavior

For functional imaging, five mice were imaged using a Sutter MOM at 15.6 Hz at 256 × 64-pixel resolution (~250 × 100 μm) through a ×40 magnification water immersion lens (Olympus, NA 0.8). Five other mice were imaged using a custom-built two-photon microscope with a resonant scanning mirror (at 30-Hz frame rate) with a ×16 objective. ScanImage was used to control the microscope. Imaging sessions lasted 45–60 min. Each session was imaged over multiple 10-min acquisitions separated by 1 min, allowing fluorescence to recover from any GCaMP photobleaching. The imaging frame clock and an iteration counter in VirMEn were recorded to synchronize imaging and behavioral data.

Data processing

Custom-written MATLAB software was designed for motion correction, definition of putative cell bodies and extraction of fluorescence traces (dF/F). Fluorescence traces were deconvolved to estimate the relative spike rate in each imaging frame, and all analyses were performed on the estimated relative spike rate to reduce the effects of GCaMP signal decay kinetics.

To estimate the neural activity of individual cells from the calcium imaging data, we processed the data in the following steps: (1) motion correction—motion artifacts in the imaging data were corrected in each imaging frame. First, ‘line-shift correction’ was performed to correct for line-by-line alternating offsets in images due to bidirectional scanning. Then, ‘sample movement correction’ was performed to remove between-frame rigid movement artifacts by FFT-based 2D cross-correlation65 and within-frame nonrigid movement artifacts by the Lucas–Kanade method66. (2) Cell selection—the spatial footprint of a putative cell was identified based on the correlation of fluorescence time series between nearby pixels. The correlation of fluorescence time series was calculated for each pair of pixels within a ~60 × 60-μm square neighborhood. Then, putative cells were identified by applying a continuous-valued, eigenvector-based approximation of the normalized cuts objective to the correlation matrix, followed by discrete segmentation with k-means clustering, which generated binary masks for all putative cells. (3) dF/F calculation—the magnitude of calcium transients was estimated by subtracting the background fluorescence from the raw fluorescence of a putative cell. For each putative cell, the background fluorescence of local neuropil was estimated as the average fluorescence of pixels that did not contain putative cells. Then, the neuropil fluorescence time series was scaled to fit the raw fluorescence of the putative cell by iteratively reweighted least-squares (robustfit.m in MATLAB) and was subtracted from the raw fluorescence to yield neuropil-subtracted fluorescence (Fsub). Then, dF/F was calculated as \(({F}_{{\rm{sub}}}-{F}_{{\rm{baseline}}})/{F}_{{\rm{baseline}}}\), where \({F}_{{\rm{baseline}}}\) was a linear fit of \({F}_{{\rm{sub}}}\) using iteratively reweighted least-squares (robustfit.m in MATLAB). 
The code used in steps 1–3 is available at https://github.com/HarveyLab/Acquisition2P_class.git. (4) Deconvolution—the timing of spike events that led to calcium transients was estimated by deconvolution of fluorescence transients. dF/F was deconvolved by OASIS AR1 (ref. 67), which models the fluorescence of each calcium transient as a spike increase followed by an exponential decay, whose decay constant was fitted to each cell. The deconvolved fluorescence resulted in spikes that were sparse in time and varied in magnitude. The deconvolved fluorescence was used as the estimate of neural activity for the majority of the analyses.
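The neuropil scale-and-subtract and baseline division of step 3 can be sketched in Python. This is an illustrative reimplementation, not the published MATLAB pipeline; `irls_fit` is a simplified Huber-weighted IRLS stand-in for robustfit.m.

```python
import numpy as np

def irls_fit(x, y, n_iter=20, k=1.345):
    """Robust line fit y ~ b0 + b1*x by iteratively reweighted least
    squares with Huber weights (simplified stand-in for robustfit.m)."""
    X = np.column_stack([np.ones_like(x), x])
    w = np.ones_like(y)
    beta = np.zeros(2)
    for _ in range(n_iter):
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745 + 1e-12  # robust (MAD) scale
        u = np.abs(r) / (k * s)
        w = np.where(u <= 1.0, 1.0, 1.0 / u)       # Huber weights
    return beta  # (intercept, slope)

def neuropil_corrected_dff(f_raw, f_neuropil, t):
    """Scale the neuropil trace to the raw trace, subtract, then divide
    by a robust linear baseline fit to obtain dF/F (sketch of step 3)."""
    _, slope = irls_fit(f_neuropil, f_raw)
    f_sub = f_raw - slope * f_neuropil
    b0, b1 = irls_fit(t, f_sub)
    f_baseline = b0 + b1 * t
    return (f_sub - f_baseline) / f_baseline
```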

NPvC models of single-neuron activity

To quantify the information carried by a neuron’s activity about task variables, while discounting possible contributions from movement variables, we built a multivariate probabilistic model of the activity of each neuron, time, behavioral variables (running velocity and acceleration) and task variables (schematized in Fig. 2c). We built this model using NPvC. We chose vine copulas because they allow constructing arbitrarily complex multivariate relationships (including relationships between non-neural variables, for example, relationships between trial types and behavioral variables) by combining bivariate relationships, which can be sampled accurately and robustly from a finite number of trials regardless of the details of the marginal distributions. We chose nonparametric estimators because the form of these relationships is not known a priori and we wanted to avoid biases originating from inaccurate assumptions. Here we describe the details of computing our nonparametric vine copula model, specifically for determining the probability density function of neural population activity given behavioral and task variables at each time point in the trial (Fig. 2b) and for assessing its goodness of fit (Fig. 2c).

We used vine copulas to estimate the conditional joint probability density function \(f\left(\bf{x}|\varGamma \right)\) for a set of variables \(\bf{x}=\{r,t,\bf{B}\}\), which, in our case, consists of the activity of a neuron \({x}_{1}\equiv r\), time \({x}_{2}\equiv t\) and the 5D vector \(\bf{B}\) of the five behavioral variables that we measured (virtual heading direction, lateral and forward running velocities, and lateral and forward accelerations), for each trial type Γ. Each trial type Γ (Γ =1 … 8) is defined by the sample cue, the test cue and trial outcome (correct or incorrect). Thus, there are four trial types for trials with correct choices and four trial types for trials with incorrect choices. Using the copula decomposition, the probability density function for each trial type Γ is represented as a product of the single-variable marginal probability density functions \(f\left({x}_{i}|\varGamma \right)\), and the copula \(c(\bf{x}|\varGamma)\), which captures the dependencies between all the variables, as follows:

$$f\left(\bf{x}|\varGamma \right)=\mathop{\prod }\limits_{i}f\left({x}_{i}|\varGamma \right)c\left(\bf{x}|\varGamma \right)$$
(1)

We used a kernel density estimator68 to compute the single-variable marginal probability densities and a nonparametric c-vine copula to estimate the copula, representing the correlation structure between variables. We used a c-vine graphical model with neuron activity as the central variable. The probability density function of a c-vine can be expressed as the product of a sequence of nonparametric bivariate copulas32,69 (Supplementary Note ‘Vine copula modeling of neural responses’). We ordered the variables in the c-vine graphical model as in \(\bf{x}=\{r,t,\bf{B}\}\), introduced above. Furthermore, we simplified the vine copula structure by decomposing the copula as a product of a time-dependent and a time-independent component, \(c\left(\bf{x}|\varGamma \right)=c\left(r,t|\varGamma \right)c(r,\bf{B}|\varGamma )\), meaning that we assumed the tuning of neurons to movement variables is time independent. The sequence of bivariate copulas forming the vine copula was then fitted to the data using a sequential kernel-based local likelihood process (Supplementary Note ‘Nonparametric pairwise copula estimation’)35. For each of the bivariate copulas, the kernel bandwidth was fitted to maximize the local likelihood obtained using a fivefold cross-validation method35. Using the estimated bandwidths for each copula in the vine sequence, we computed the multivariate copula density function of the data points using a fivefold cross-validation process. We first used the training set to estimate the copula density on a 50 by 50 grid (Supplementary Note ‘Nonparametric pairwise copula estimation’) and then used the copula estimated on the grid points to interpolate the copula density on the test set. A similar procedure was followed to estimate cross-validated marginal density functions for each single variable. 
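The copula decomposition in equation (1) can be checked in closed form for the simplest case. The sketch below is illustrative only (the paper uses nonparametric estimators, not Gaussian forms, and assumes SciPy is available): it verifies that the product of two standard normal marginals and a Gaussian copula density reproduces the bivariate normal density.

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_density(u, v, rho):
    """Density of the bivariate Gaussian copula with correlation rho."""
    a, b = norm.ppf(u), norm.ppf(v)
    q = 1.0 - rho**2
    return np.exp(-(rho**2 * (a**2 + b**2) - 2.0 * rho * a * b)
                  / (2.0 * q)) / np.sqrt(q)

def joint_via_copula(x, y, rho):
    """Equation (1) for two variables: marginals times copula density."""
    return (norm.pdf(x) * norm.pdf(y)
            * gaussian_copula_density(norm.cdf(x), norm.cdf(y), rho))

def joint_direct(x, y, rho):
    """Bivariate standard normal density with correlation rho."""
    q = 1.0 - rho**2
    return np.exp(-(x**2 - 2.0 * rho * x * y + y**2)
                  / (2.0 * q)) / (2.0 * np.pi * np.sqrt(q))
```

The agreement of the two forms illustrates why the copula factorization is exact: the copula captures all the dependence, leaving the marginals untouched.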
The conditional probability of neural responses in each trial condition and for each value of behavioral variables is computed using \(f(r|t,\bf{B},\varGamma )=\frac{f\,(r,t,\bf{B}|\varGamma )}{f\,(t,\bf{B}|\varGamma )}\), where, to compute the density function \(f(t,\bf{B}|\varGamma )\), we marginalized by integrating over the neuron response

$$f\left(t,\bf{B}|\varGamma \right)={\int }_{{\!r}_{\min }}^{{r}_{\max }}f\left(r,t,\bf{B}|\varGamma \right){\mathrm{d}r}\approx \mathop{\sum }\limits_{i=1}^{n}f\left({r}_{i},t,\bf{B}|\varGamma \right)\delta r$$
(2)

In equation (2), we approximated the integral as a sum over a set of n = 200 points \({\{r}_{i}\}\) spaced linearly between the minimum (\({r}_{\min }\)) and maximum (\({r}_{\max }\)) value of the neural response within the session, with \(\delta r=\frac{{r}_{\max }-{r}_{\min }}{n}\). We computed the density function \(f(t,\bf{B}|\varGamma )\) by marginalization of \(f(r,t,\bf{B}|\varGamma )\), instead of fitting a new vine model between the (\(t,\bf{B}\)) variables, so that any difference between these two density functions reflects only the dependency of the neural activity on the other variables and not a difference in how the dependencies between the (\(t,\bf{B}\)) variables are quantified in the two models because of fitting differences.
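As a minimal sketch of the marginalization in equation (2), the integral over the neural response can be approximated by a Riemann sum over a linearly spaced grid. The toy joint density below is hypothetical (the actual densities come from the fitted vine copula):

```python
import numpy as np

def marginalize_over_response(joint_density, r_min, r_max, n=200):
    """Approximate f(t,B|Gamma) = integral of f(r,t,B|Gamma) dr as a Riemann
    sum over n linearly spaced response values, as in equation (2).
    `joint_density` maps an array of r values to f(r, t, B | Gamma)."""
    r_grid = np.linspace(r_min, r_max, n)
    delta_r = (r_max - r_min) / n
    return np.sum(joint_density(r_grid)) * delta_r

# Toy check: for a joint density that is constant in r on [0, 1], the
# marginal recovers that constant (the integral of 0.7 over [0, 1]).
marginal = marginalize_over_response(lambda r: np.full_like(r, 0.7), 0.0, 1.0)
```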

For each time point, we computed a copula fit of the activity of each neuron (indicated as NPvC fit in Fig. 2b) as a point \({r}_{\mathrm{cop}}\in {\{r}_{i}\}\) with the largest log likelihood given the considered trial type and the values of behavioral variables at the considered time point:

$${r}_{\mathrm{cop}}=\mathop{\mathrm{argmax}}\limits_{{r}_{i}}\log f\left({r}_{i}|t,\bf{B},\varGamma \right)$$
(3)
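The grid-based fit of equation (3) amounts to picking the grid point with the highest conditional log-likelihood. A minimal sketch, using a hypothetical log-likelihood profile in place of the fitted copula density:

```python
import numpy as np

def copula_fit(log_likelihoods, r_grid):
    """Equation (3): return the grid point r_i with the largest conditional
    log-likelihood log f(r_i | t, B, Gamma)."""
    return r_grid[np.argmax(log_likelihoods)]

# Grid of n = 200 candidate responses, as in the text.
r_grid = np.linspace(0.0, 1.0, 200)
# Hypothetical log-likelihood profile peaking near r = 0.4.
log_lik = -(r_grid - 0.4) ** 2
r_cop = copula_fit(log_lik, r_grid)
```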

GLMs of single-neuron activity

To compare with the copula, we also fit a GLM with Poisson noise, a logarithmic link function and an elastic-net regularization24,27,36. As predictors, we used the task variables defining the trial type, consisting of the sample cue, test cue, choice and all their interactions (each binary task variable coded as −1 or +1), together with the same behavioral variables used in the copula modeling. The time dependency of single-neuron activity during the task was expanded using a raised cosine basis27 as follows:

$$g\left(t\right)=\left\{\begin{array}{cc}\frac{W}{2}\left[1+\cos \left(2\pi \left(t-{t}_{c}\right)\right)\right], & \left|t-{t}_{c}\right| < 0.5\\ 0, & \mathrm{otherwise}\end{array}\right.$$
(4)

For task variables, the cosine basis function had a width of 1 s, and the value at the center peak was either positive (\(W=+1\)) or negative (\(W=-1\)) depending on the identity of each task variable. The center peaks \({t}_{c}\) were spaced at 0.5-s intervals to tile the epoch with half-width overlap. For selectivity to the movements of the mouse, we first z-scored each movement variable as \({z}_{i}\) and used a similar cosine basis as follows:

$$g\left({z}_{i}\right)=\left\{\begin{array}{cc}\frac{1}{2}\left[1+\cos \left(2\pi \left({z}_{i}-{z}_{c}\right)\right)\right], & \left|{z}_{i}-{z}_{c}\right| < 0.5\\ 0, & \mathrm{otherwise}\end{array}\right.$$
(5)

We considered 13 center peaks \({z}_{c}\) ranging from −3 to 3 with spacing of 0.5 for the cosine basis of movement variables.
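The raised cosine bumps of equations (4) and (5) can be sketched as follows; the example epoch boundaries for the task-variable centers are illustrative (any epoch tiled at 0.5-s spacing works the same way):

```python
import math

def raised_cosine(x, center, amplitude=1.0):
    """Raised cosine basis of equations (4)-(5): a smooth bump of half-width
    0.5 around `center` (in the units of x), zero elsewhere. For task
    variables the amplitude W is +1 or -1; for movement variables it is 1."""
    u = x - center
    if abs(u) < 0.5:
        return (amplitude / 2.0) * (1.0 + math.cos(2.0 * math.pi * u))
    return 0.0

# Centers spaced 0.5 apart give half-width-overlapping bumps. Example task
# centers tiling a -0.5 s to 2 s epoch (illustrative), and the 13 movement
# centers from -3 to 3 described in the text.
task_centers = [0.5 * k for k in range(-1, 5)]
movement_centers = [-3.0 + 0.5 * k for k in range(13)]
```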

Computation of neural data fitting performance of NPvC and GLMs

The performance of both the NPvC and the GLM in fitting single-trial single-neuron activity was evaluated by computing the fraction of the deviance explained (FDE) on the test data, defined as follows70:

$$\mathrm{FDE}=\frac{{L}_{\mathrm{model}}-{L}_{\mathrm{null}}}{{L}_{\mathrm{sat}}-{L}_{\mathrm{null}}}$$
(6)

where \({L}_{\rm{model}}\), \({L}_{\rm{null}}\) and \({L}_{\rm{sat}}\) are the likelihoods of observing the test data for the considered model (copula or GLM), the null model and the saturated model, respectively, and are always computed for all the time points in all trials in a cross-validated fashion. For the GLM, the null model is a model that does not have any predictors, and its prediction of activity at any time point is the time-averaged rate of the neuron. For the copula null model, we defined a model of the neural response that excluded all predictors from the trial, quantifying the proportion of neural activity that can be explained without knowledge of the time and movement variables used in the vine copula model. The null likelihood is then computed using the marginal distribution of neural activity, \({L}_{\mathrm{null}}=\log [f({r}_{\mathrm{data}})]\). The saturated model is the generative model in which the prediction exactly matches the observed activity at each time point in the test data. For the GLM, these values were computed using the analytical form of the GLM output at each time point using the logarithmic link function. For the nonparametric copula model, the vine copula model is used to compute the model likelihood for each neural activity value as \({L}_{\mathrm{model}}=\log [f({r}_{\mathrm{data}}|t,{\bf{B}},\varGamma )]\), where \({r}_{{\rm{data}}}\) is the real single-trial neural activity at time t from the data. Because the model prediction is derived from equation (3), the saturated likelihood was taken to be the peak likelihood, \({L}_{\mathrm{sat}}=\max (\log [f({r}_{i}|t,\bf{B},\varGamma )])\), where \(\{{r}_{i}\}\) is the set of n = 200 grid points on the neural activity described above, ranging from the minimum to the maximum possible neural activity of each neuron.
Each of the likelihood values on the data was computed using a fivefold cross-validation by fitting the NPvC on the training folds and estimating it on the test fold. The density function is estimated over a grid using the training set, and the grid is then used to compute the density function over the test set (Supplementary Note ‘Nonparametric pairwise copula estimation’). We considered five different task epochs and fitted the model to each of them. The epochs spanned from 0.5 s before to 2 s after the sample cue onset, delay onset, test cue onset, start of the turn and reward onset. We used only sessions that had at least five trials of each of the eight trial types.
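The fraction of deviance explained in equation (6) is a simple ratio of log-likelihoods. A minimal sketch with hypothetical test-set log-likelihood values:

```python
def fraction_deviance_explained(l_model, l_null, l_sat):
    """Equation (6): fraction of deviance explained (FDE). It is 0 for a
    model no better than the null model and 1 for a model whose likelihood
    matches the saturated model."""
    return (l_model - l_null) / (l_sat - l_null)

# Hypothetical cross-validated log-likelihoods: the fitted model sits
# between the null and saturated likelihoods.
fde = fraction_deviance_explained(l_model=-120.0, l_null=-200.0, l_sat=-100.0)
```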

Estimation of mutual information for single neurons

We used probabilities estimated from the vine copula approach, explained in ‘NPvC models of single-neuron activity’, to estimate single-neuron mutual information values (Fig. 3). To compute the mutual information between a group of variables (including neural activity and/or behavioral variables) and a task variable, we used a decoding approach and then computed the information in the confusion matrix71,72,73. Here a task variable (c) was a binary variable taking a value c of +1 or −1 to indicate one of the two possible values of either the sample cue, test cue, reward location or choice direction. For each task variable, we decoded its value in each trial using the copula model to compute (through Bayes’ rule) the posterior probability of the task variable given the observation in the same trial of a set of variables \(\bf{z}\), which could be the full set \(\bf{x}=\{r,t,\bf{B}\}\) (Extended Data Fig. 2b) or a subset of it. This posterior probability of c can be obtained from the copula model as follows:

$$P\left(c|\bf{z}\right)=\frac{\mathop{\sum}\limits_{\varGamma \in c}f\left(\bf{z}|\varGamma \right)P\left(\varGamma \right)}{\mathop{\sum }\limits_{\varGamma }f\left(\bf{z}|\varGamma \right)P\left(\varGamma \right)}$$
(7)

where the sum in the numerator is over the trial types \(\varGamma\) (corresponding to the combinations of two sample cues, two test cues and two trial outcomes (correct or incorrect), consisting of eight possible values when using all trials and four values when using either correct or incorrect trials only) that have value c for the considered task variable. For example, for analyses limited to correct or incorrect trials, the sum will be over the four correct or incorrect trial types. For analyses using all the trials, the sum will be over all eight trial types. For sample cue information in correct trials only, the sum in equation (7) will be over the two trial types with correct outcome and two sample cues, and so on. \(P\left(\varGamma \right)\) is the probability of occurrence of trial type Γ across all considered trials. In the above equation, the probability density of the variables \(\bf{z}\) given the trial type Γ, \(f\left(\bf{z}|\varGamma \right)\), is computed using the copula model by using equation (1) when \(\bf{z}=\{r,t,{\bf{B}}\}\) or using equation (2) when \(\bf{z}=\{t,\bf{B}\}\).

We then used the posterior probabilities to decode the most likely task variable given the variables \(\bf{z}\) observed in the considered trial:

$${\hat{c}}=\mathop{\mathrm{argmax}}\limits_{{c}^{\prime}\in \left\{-1,+1\right\}}P\left({c}^{\prime}|\bf{z}\right)$$
(8)

The information about the task variable decoded from neural activity was then computed as the mutual information between the real value C of the variable and the one \(\hat{C}\) decoded from neural activity, as follows:

$$I\left(C{;}\hat{{C}}\right)=\mathop{\sum }\limits_{c,\hat{c}}P\left(c,\hat{c}\right){\log }_{2}\frac{P\left(c,\hat{c}\right)}{P\left(c\right)P\left(\hat{c}\right)}$$
(9)

where \(P\left(c,\hat{c}\right)\) is the confusion matrix, that is, the probability that the true value of the task variable is c and the decoded value is \(\hat{c}\), and \(P\left(c\right)\) and \(P\left(\hat{c}\right)\) are the marginal probabilities. We also computed the NPvC decoder’s performance as the fraction of correct decoding, obtaining similar results (Extended Data Fig. 3f).
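The decoding chain of equations (7)-(9) can be sketched as follows. The per-trial-type likelihoods, priors, and trial-type labels below are hypothetical stand-ins for the copula-derived densities; the posterior sums the likelihoods of the trial types consistent with each value of c, as described in the text:

```python
import math
from collections import Counter

def decode_task_variable(f_z_given_gamma, prior, c_of_gamma):
    """Equations (7)-(8): Bayes posterior P(c|z) over the binary task
    variable, then argmax decoding. `f_z_given_gamma` and `prior` are dicts
    keyed by trial type; `c_of_gamma` maps each trial type to +1 or -1."""
    norm = sum(f_z_given_gamma[g] * prior[g] for g in prior)
    post = {-1: 0.0, +1: 0.0}
    for g in prior:
        post[c_of_gamma[g]] += f_z_given_gamma[g] * prior[g] / norm
    return max(post, key=post.get)

def confusion_matrix_information(true_vals, decoded_vals):
    """Equation (9): mutual information (bits) between the true and the
    decoded task variable, computed from the empirical confusion matrix."""
    n = len(true_vals)
    joint = Counter(zip(true_vals, decoded_vals))
    p_c, p_chat = Counter(true_vals), Counter(decoded_vals)
    info = 0.0
    for (c, chat), k in joint.items():
        # (k/n) * log2[(k/n) / ((k_c/n)(k_chat/n))] simplified below
        info += (k / n) * math.log2(k * n / (p_c[c] * p_chat[chat]))
    return info

# A perfect binary decoder on balanced trials carries 1 bit.
true = [-1, -1, +1, +1]
info_perfect = confusion_matrix_information(true, true)
```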

To compute the information \(I(r{;}C{|}t,\bf{B})\) between the task variable and the neural activity at a given time point, conditioned over the behavioral variables, we used the following equation for conditional mutual information:

$$I\left({r;C|t},\bf{B}\right)=I\left(r,t,\bf{B}{;C}\right)-I\left(t,\bf{B}{;C}\right)$$
(10)

In the above equation, \(I(r,t,\bf{B}{;}C)\) was computed after decoding \(\hat{C}\) using the copula derived from equation (1), where \(\bf{z}=\{r,t,\bf{B}\}\), and \(I(t,\bf{B}{;}C)\) was computed after decoding \(\hat{C}\) using the copula constructed with equation (2), where \(\bf{z}=\{t,\bf{B}\}\).

We computed the mutual information \(I(r{;}\bf{B}{|}t)\) between the neural activity and the behavioral variables \(\bf{B}\) at time \(t\) (used in Fig. 3a,b) using Shannon’s mutual information formula with probabilities defined by the copula \(f(r|t,\bf{B},\varGamma )\), the marginalized copula \(f\left(r|t,\varGamma \right)\) and the probability \(P\left(\varGamma \right)\) of each trial type Γ as follows:

$$I\left({r;}\bf{B}{|t}\right)=\mathop{\sum }\limits_{\varGamma }P\left(\varGamma \right)H\left(r|t,\varGamma \right)-\mathop{\sum }\limits_{\varGamma }P\left(\varGamma \right)H\left(r|t,\bf{B},\varGamma \right)$$
(11)

The entropies \(H(r|t,\bf{B},\varGamma )\) and \(H\left(r|t,\varGamma \right)\) were computed by averaging the corresponding NPvC estimated log-likelihoods at the neural activity and movement values sampled from the trial type Γ at each time point.

For the information about the task variables, we chose to compute information from the decoding confusion matrix, similar to, for example, ref. 24, because when computing neural population information about categorical variables, this computation is robust and has a lower variance and bias than a direct computation from the probabilities. The reason is that the decoding step compresses the dimensionality of the neural responses from which the information is computed, especially when considering neuron-pair responses and interaction information. This robustness was useful because we wanted to use the information values about task variables at both the single-neuron and single-pair levels. For the information computed about the behavioral variables, decoding and confusion matrix computations were not useful, as the space of behavioral variables was larger than that of neuronal activity. Therefore, we used a direct calculation of information through the Shannon formula based on the response probabilities. To correct for the limited-sampling bias in the information estimation, we computed a shuffled information distribution by repeating the same process 1,000 times after shuffling the trial-type label (for the task variable information; equations (9) and (10)) or the neural activity (for the behavioral variable information in equation (11)) and subtracted the mean of the shuffled distribution from the estimated mutual information to correct for the bias74 (Extended Data Fig. 2c).

To compare the magnitude of information values computed from neurons and neuronal populations to the information values needed to achieve the mouse’s behavioral performance, we generated a set of simulations to convert task performance values (defined as the fraction of trials for which the mouse’s choice was rewarded) into the corresponding mutual information values (defined as the mutual information between the rewarded choice and the mouse’s choices in all trials). This relationship is reported as the left and right y-axis values in Extended Data Fig. 1a. To generate this relationship, we considered n trials (we used \(n=200\) in the results reported in Extended Data Fig. 1a, as this was similar to the range of trial numbers in the real experimental data) with equal numbers of left and right rewarded directions (represented by a vector C). Then, for various mouse performance values P (we considered 200 equally spaced performance values between 0.65 and 1), we generated a corresponding mouse’s choice vector (represented by a vector \(\hat{{\bf{C}}}\)) with a fraction \(P\) of correct decisions. We then computed the empirical confusion matrix \(P\left({\bf{C}},\hat{{\bf{C}}}\right)\) and, from this probability matrix, computed the mutual information \(I\left({\bf{C}};\hat{{\bf{C}}}\right)\) between the rewarded choice and the mouse’s choice using the Shannon information of equation (9) for any behavioral performance P. To remove any limited sampling bias, as we did for the real data, we subtracted from \(I\left({\bf{C}};\hat{{\bf{C}}}\right)\) its shuffled values, computed after destroying the information by randomly shuffling the vector \(\hat{{\bf{C}}}\) (results averaged over 1,000 random shuffles).
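The performance-to-information conversion can be sketched as below. This is a simplified illustration (without the shuffle-based bias subtraction described in the text, and with an assumed seeded random placement of error trials):

```python
import math
import random
from collections import Counter

def performance_to_information(p_correct, n_trials=200, seed=0):
    """Simulate n_trials balanced left/right rewarded trials, flip a fraction
    1 - p_correct of the choices to errors, and compute the mutual
    information (bits) between rewarded side and choice from the empirical
    confusion matrix, as in equation (9)."""
    rng = random.Random(seed)
    rewarded = [-1] * (n_trials // 2) + [+1] * (n_trials // 2)
    n_errors = round((1.0 - p_correct) * n_trials)
    error_idx = set(rng.sample(range(n_trials), n_errors))
    choices = [-c if i in error_idx else c for i, c in enumerate(rewarded)]
    n = len(rewarded)
    joint = Counter(zip(rewarded, choices))
    p_c, p_chat = Counter(rewarded), Counter(choices)
    info = 0.0
    for (c, chat), k in joint.items():
        info += (k / n) * math.log2(k * n / (p_c[c] * p_chat[chat]))
    return info

info_at_perfect = performance_to_information(1.0)  # no errors: 1 bit
```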

Calculation of noise correlations and partialized noise correlations from pairs of neurons

We computed the noise correlations between each pair of neurons (Figs. 4–6) as Spearman (rank) correlations at fixed trial type, averaging across trial types. Noise correlations are usually computed as Pearson correlations at fixed trial type; here we used Spearman correlations because we had evidence of nonlinear interactions (Extended Data Fig. 5j).

Because noise correlations computed as mentioned above may be affected by common tuning to behavioral variables, we also computed noise correlations partialized on the behavioral variables (Figs. 4–6) as rank correlations using the estimated cumulative distribution function (CDF) computed with the copula. This method generalizes previous computations of noise correlations, which have been partialized using linear regressions or GLM regression of the behavioral variables24,75. We first computed the rank of each activity at each time t by evaluating its CDF computed through the copula (the CDF is a measure of rank percentile76, because it measures the fraction of points with a value lower than the considered one). To compute the partialized noise correlation, we pooled the CDFs computed for the activities in single trials from the same trial condition over the first 2 s after the test cue onset and computed the Spearman correlation coefficient. To compute the CDFs at fixed trial type, we first computed the conditioned marginal density functions \(f\left({r|t},\varGamma \right)\) for each neuron. To compute noise correlations at fixed trial type conditioned on the behavioral variables, we used the NPvC to compute the conditional marginal density functions \(f(r{|}t,\bf{B},\varGamma )\), which specify the rank specific to the observed value of the behavioral variables at the considered time point for each neuron. To compute the CDFs for each of these density functions, we first computed the CDF over a grid with 200 points on the neural activity axis, ranging from the minimum to the maximum possible neural activity values, by numerical integration of the density function (using the integral definition of the CDF). We then used the CDF values on the grid to compute the CDF at each value of neural activity in each trial and time point by interpolation over the grid.
Using CDFs and copulas to compute noise correlations has been shown to be robust and accurate in estimating noise correlations in neurons77.
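A minimal sketch of the final correlation step: once each neuron's activity has been converted to conditional CDF values (rank percentiles), the partialized noise correlation is simply the Spearman correlation of those values. The CDF inputs below are hypothetical; the double-argsort ranking assumes no ties, which holds for continuous CDF values:

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks
    (double argsort gives the rank of each element, assuming no ties)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))

def partialized_noise_correlation(cdf1, cdf2):
    """Given conditional CDF values F(r_i | t, B, Gamma) for two neurons,
    pooled over the time points of one trial type, their rank correlation is
    the noise correlation with common behavioral tuning partialized out."""
    return spearman(np.asarray(cdf1), np.asarray(cdf2))
```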

Estimation of interaction information using NPvC models for pairs of neurons

Following previous work78, for any pair of neurons with activity \({r}_{1},{r}_{2}\), the interaction information about a task variable was defined as the difference between the pairwise information \({I}^{{\rm{pairwise}}}\left({r}_{1},{r}_{2}{;C}\right)\) about the task variable carried by the joint observation of the pair of neurons (which reflects both single-neuron properties and the effect of interactions between neurons) and the independent information \({I}^{{\rm{ind}}}\left({r}_{1},{r}_{2}{;C}\right)\), which reflects only single-cell properties and is defined as the information carried by the pair of neurons when they are conditionally independent.

Notably, in computing interaction information, we also conditioned over the movement variables (as we did for single-neuron information) to isolate the part of the interaction that is related to the task variable selectivity. Conditioning removes the interaction emerging from shared movement selectivity and from correlations between the task and movement variables. Thus, interaction information was computed as follows:

$${I}^{\,\mathrm{int}}\left({r}_{1},{r}_{2}{;C|t},\bf{B}\right)={I}^{\,\mathrm{pairwise}}\left({r}_{1},{r}_{2}{;C|t},\bf{B}\right)-{I}^{\,\mathrm{ind}}\left({r}_{1},{r}_{2}{;C|t},\bf{B}\right)$$
(12)

To compute the full and independent information components, we first estimated the pairwise and independent probability density functions \({f}^{\,\mathrm{pairwise}}({r}_{1},{r}_{2},t,\bf{B}|\varGamma )\) and \({f}^{\,\mathrm{ind}}({r}_{1},{r}_{2},t,\bf{B}|\varGamma )\) for each trial type, respectively. The full probability density function was estimated using the following breakdown of the density function:

$$\begin{array}{l}{f}^{\,\mathrm{pairwise}}\left({r}_{1},{r}_{2},t,\bf{B}|\varGamma \right)=f\left({r}_{1}|t,\bf{B},\varGamma \right)f\left({r}_{2}|t,\bf{B},\varGamma \right)\\ c\left({r}_{1},{r}_{2}|t,\bf{B},\varGamma \right)f\left(t,\bf{B}|\varGamma \right)\end{array}$$
(13)

The independent probability density function was estimated using a similar breakdown

$${f}^{\,\mathrm{ind}}\left({r}_{1},{r}_{2},t,\bf{B}|\varGamma \right)=f\left({\widetilde{r}}_{1}|t,\bf{B},\varGamma \right)f\left({\widetilde{r}}_{2}|t,\bf{B},\varGamma \right)f\left(t,\bf{B}|\varGamma \right)$$
(14)

where \({\widetilde{r}}_{1}\) and \({\widetilde{r}}_{2}\) are sampled from neuronal activities shuffled over trials of a single trial type, and the independent information is estimated as the average over 1,000 such shuffles. The extra component in the full density function, equation (13), \(c({r}_{1},{r}_{2}|t,\bf{B},\varGamma )\), is the neuron-pairwise copula term, which represents the interaction between the pair of neurons for the fixed trial condition \(\varGamma\), conditioned over the movement variables \(\bf{B}\) at any time \(t\). This term was computed (conceptually similarly to the computation of the conditional noise correlation explained above) by building a bivariate copula between the values of the conditioned probability functions of the pair of neurons. Thus, we use the definition of a copula as the density function of the cumulative density functions of the data, \(c({r}_{1},{r}_{2}|t,\bf{B},\varGamma )=c(F({r}_{1}{|}t,\bf{B},\varGamma ),F({r}_{2}{|}t,\bf{B},\varGamma ){|}t,\bf{B},\varGamma )\), where \(F(\bullet )\) denotes the cumulative density function computed by integrating over the density function. To compute this copula, we assumed that the conditional copula density is independent of the value of the movement variables at each time point, that is, \(c(F({r}_{1}{|}t,\bf{B},\varGamma ),F({r}_{2}{|}t,\bf{B},\varGamma ){|}t,\bf{B},\varGamma )\approx c(F({r}_{1}{|}t,\bf{B},\varGamma ),F({r}_{2}{|}t,\bf{B},\varGamma ){|}t,\varGamma )\), also known as the simplifying assumption. We note, however, that this quantity still depends on the average shared cotuning of the two neurons to movement variables through the single-neuron cumulative density functions. This copula is estimated by pooling the values of the conditioned probability functions at each time point for a given trial type.

To compute each of the conditional marginal densities \(f({r}_{i}|t,\bf{B},\varGamma )\) of activity of neuron i conditional on time within the trial, behavioral variables and trial type, we used the single-neuron NPvC models we fitted for single neurons and the following identity:

$$f\left({r}_{i}|t,\bf{B},\varGamma \right)=\frac{f\left({r}_{i},t,\bf{B}|\varGamma \right)}{f\left(t,\bf{B}|\varGamma \right)}\,\mathrm{for}\,i=1,2$$
(15)

The density functions \(f({r}_{i},t,\bf{B}|\varGamma )\) and \(f(t,\bf{B}|\varGamma )\) were numerically estimated as explained in ‘NPvC models of single-neuron activity’. All these density functions were estimated using the same fivefold cross-validation method explained in ‘NPvC models of single-neuron activity’.

After computing the density functions of equations (13) and (14), we used the same decoding approach as for single-neuron information values (‘Estimation of mutual information for single neurons’) to compute the information values on the right-hand side of equation (12), \({I}^{\mathrm{pairwise}}({r}_{1},{r}_{2}{;}C{|}t,\bf{B})\) and \({I}^{\mathrm{ind}}({r}_{1},{r}_{2}{;}C{|}t,\bf{B})\), at any time point. The only difference compared to the single-neuron information computation is that, to compute the decoded task variable \(\hat{C}\) (which is used later to compute the confusion matrix), we used the pairwise and independent density functions (equations (13) and (14)) instead of single-neuron density functions. Although in the main text we use the mutual information in the decoder’s confusion matrix to compute information, due to its suitability for assessing the role of independent versus correlated neural activity, we also computed the NPvC decoder’s performance as the fraction of correct decoding, obtaining similar results (Extended Data Fig. 5g).

We labeled each pair of neurons as IL, independent or IE if the mean of the interaction information for the pair in the first 2 s after the test cue onset was significantly negative, not different from zero, or significantly positive, respectively. Of note, the nomenclature we used for IL is similar to that of other recent studies7,25. This naming simply denotes that correlations decrease information and is less restrictive than the meaning given to this term in some work79. We computed the statistical significance of the sign of pairwise interaction information by building a bootstrap distribution of the mean interaction values (1,000 bootstraps) and computed a P value using a signed t test for the probability that the mean of this distribution is either positive (P < 0.01, signed t test), negative (P < 0.01, signed t test) or indistinguishable from zero (P > 0.01, t test) after applying a Holm–Bonferroni multiple comparison correction over all pairs80. In total, we had 145,439 nonlabeled pairs and 1,355 same-target projection pairs.

Triplet statistics and global clustering coefficient

To compute the triplet probabilities presented in Fig. 5, we counted the number of each triplet type in each session and divided this count by the total triplet count (in total, n = 1,080,382 triplets of nonlabeled neurons and n = 30,204 triplets of same-target projection neurons). We estimated the statistical s.e. of the mean for these probabilities using bootstrapping (with 100 bootstrap subsamples). Because we were interested in deviations of triplet probabilities from the values expected in networks with randomly structured interactions, for each bootstrap subsample, we computed an estimate of the probability of each triplet type in a random network by shuffling the pairwise interactions across neurons within each session. Notably, this shuffling did not change the total number of IL, IE and independent pairs within the network. The relative triplet probability was computed as the difference between the real triplet probabilities and the shuffled probabilities, averaged over shuffles and bootstraps.

We also computed global clustering coefficients42,43 to detect the presence of subgraph clusters of IL or IE pairwise interactions within the network. We computed the global clustering coefficient separately for subgraphs of only IL and IE pairwise interactions. To compute the IL (IE) global clustering coefficient, we first defined a binary graph in which neuron pairs have an edge if the pair has a significant value of IL (IE) interaction information. If the pair does not have significant interaction information, this pair does not have an edge. For each of these two graphs separately, we then counted the number of closed triplets (triplets with edges between all three neurons) and open triplets (with only two edges between the three neurons) and computed a clustering coefficient (CC) as follows42,43:

$$\mathrm{CC}=\frac{\mathrm{number}\,\mathrm{of}\,\mathrm{closed}\,\mathrm{triplets}}{\mathrm{number}\,\mathrm{of}\,\mathrm{closed}\,\mathrm{triplets}+\mathrm{number}\,\mathrm{of}\,\mathrm{open}\,\mathrm{triplets}}$$
(16)

The CC ranges between 0 and 1, with larger values corresponding to a graph with a greater number of closed triplets, indicating the existence of clusters of nodes that are more densely connected among themselves than to other nodes. A graph containing a fully connected cluster of nodes (similar to the middle panel of Fig. 5a) will have few open triplets, because any selection of three nodes will either form a closed triplet (when all three nodes are selected from the cluster) or contain at most one edge; thus, using equation (16), the clustering coefficient will be close or equal to one. We also computed a relative clustering coefficient as the difference between the clustering coefficient computed on the data and the one computed on a randomly organized network, using the shuffling procedure described above. Statistics were assessed using bootstrapping, as explained above for the triplet probabilities.
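The triplet counting behind equation (16) can be sketched by brute-force enumeration over all node triples of a binary interaction graph (the edge sets below are toy examples, not data):

```python
from itertools import combinations

def global_clustering_coefficient(nodes, edges):
    """Equation (16): closed / (closed + open) triplets. `edges` is a set of
    frozensets {i, j} marking pairs with significant IL (or IE) interaction
    information; triplets with fewer than two edges count for neither."""
    closed = opened = 0
    for triple in combinations(nodes, 3):
        n_edges = sum(frozenset(p) in edges for p in combinations(triple, 2))
        if n_edges == 3:
            closed += 1
        elif n_edges == 2:
            opened += 1
    return closed / (closed + opened) if closed + opened else 0.0

# A triangle alone is one closed triplet (CC = 1); adding a dangling node
# creates two open triplets, lowering the CC.
triangle = {frozenset(p) for p in [(0, 1), (1, 2), (0, 2)]}
cc_triangle = global_clustering_coefficient([0, 1, 2], triangle)
cc_pendant = global_clustering_coefficient([0, 1, 2, 3],
                                           triangle | {frozenset((2, 3))})
```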

Of note, the unlabeled cells could include some triplets all projecting to a downstream area that was not labeled with the retrograde tracers. Finding a null relative clustering coefficient for the unlabeled cells therefore does not rule out the existence of subnetworks with nonrandom structure whose opposite deviations from randomness wash out at the whole-network level. We, however, verified with computations of the clustering coefficient and/or plots of triplet distributions that the nonlabeled cells are random, even when partitioned in terms of the sign of interaction information (Fig. 5d) or in terms of choice preference (Fig. 5j,k).

Parametric two-pool model of network structure

To better understand the relation between network structure and triplet probabilities, we considered a simple parametric model of the network structure within the well-studied framework of network clustering models known as the planted partition or stochastic block model81. We considered a network with n nodes (corresponding to single neurons) that are connected with either a positive interaction with probability \(P\left(+\right)\) or a negative interaction with probability \(P\left(-\right)=1-P\left(+\right)\). We considered a simple structure consisting of two pools of size \(n/2\), with each pool having different positive and negative probabilities of \(\left({P}_{1}\left(+\right),{P}_{1}\left(-\right)\right)=\left(P\left(+\right)+\delta {P}_{1},P\left(-\right)-\delta {P}_{1}\right)\) and \(\left({P}_{2}\left(+\right),{P}_{2}\left(-\right)\right)=\left(P\left(+\right)+\delta {P}_{2},P\left(-\right)-\delta {P}_{2}\right)\), where \(\delta {P}_{1}\) and \(\delta {P}_{2}\) are the two parameters that express the difference of the IE probability within each pool relative to the average IE probability in the full population. Each value of \(\left(\delta {P}_{1},\delta {P}_{2}\right)\) defines a network structure. A network with uniformly distributed interactions will have \(\left(\delta {P}_{1}=0,\delta {P}_{2}=0\right)\), while other combinations of \(\left(\delta {P}_{1},\delta {P}_{2}\right)\) generate two-pool networks with extra IL or IE interactions within the pools (as shown in Fig. 5f). Considering that the network is fully connected, we computed analytically the probabilities of the edges connecting the two pools in terms of \(\left(\delta {P}_{1},\delta {P}_{2}\right)\) for fixed values of \(\left(P\left(+\right),P\left(-\right)\right)\), as follows:

$$\begin{array}{l}{P}_{B}\left(+\right)=2\frac{P\left(+\right)n\left(n-1\right)-{P}_{1}\left(+\right)\frac{n}{2}\left(\frac{n}{2}-1\right)-{P}_{2}\left(+\right)\frac{n}{2}\left(\frac{n}{2}-1\right)}{{n}^{2}}\\ {P}_{B}\left(-\right)=1-{P}_{B}\left(+\right)\end{array}$$
(17)

Because we specified the full probability of connections between the two pools in the model, we can also analytically compute the probabilities of each triplet type, both for the structured and shuffled networks, as described in the Supplementary Note (‘Analytical calculation of triplet probabilities in the two-pool model of network structure’).
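Equation (17) can be sketched directly; with zero within-pool offsets it reduces to the overall probability \(P(+)\), and for nonzero offsets the within-pool and between-pool edges together conserve the total number of positive edges:

```python
def between_pool_probability(p_plus, dp1, dp2, n):
    """Equation (17): probability P_B(+) of a positive (IE) edge between the
    two pools, given the overall positive-edge probability p_plus, the
    within-pool offsets dp1 and dp2, and n nodes split into pools of n/2."""
    p1 = p_plus + dp1                 # within-pool-1 positive probability
    p2 = p_plus + dp2                 # within-pool-2 positive probability
    within = (n / 2) * (n / 2 - 1)    # ordered within-pool pair count
    return 2.0 * (p_plus * n * (n - 1) - p1 * within - p2 * within) / n ** 2

# With no offsets, the between-pool probability equals p_plus.
p_uniform = between_pool_probability(0.5, 0.0, 0.0, 4)
```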

Motif expansion of population information

To understand how the structure of interactions among different neurons contributes to population coding, we computed an analytical expansion of the information carried by the activity of a population of neurons about a task variable as a function of the network motifs in which interaction information between pairs of neurons is structured. As is often done in models of population coding82,83, we assumed that neural activity for a fixed trial type follows a multivariate Gaussian distribution. Furthermore, as we found in our data, when computing the expansion, we assumed that the population had a finite size, that single-neuron information values were small and that pairwise noise correlations between the neurons were small, which (as we will show) in turn implies that pairwise interaction information values are smaller than single-neuron information values (again, as found in our data). With these assumptions, we approximated the information carried by the population as a sum of terms that can be directly related to the structure of interaction information within the network. The detailed information on this expansion and assumptions is presented in Supplementary Note (‘Analytical expansion of the population information’).

We first expressed the full population information as the sum of the following two components, as is often done in studies of population codes and as follows from extending the concept of pairwise interaction information in equation (12) to an arbitrary population size: the independent information \({I}^{{\rm{ind}}}\), which is the information of a population with the same single-neuron information values as found in the data but with no interactions between the neurons (where an interaction is defined as a noise correlation, that is, a statistical relationship between the neurons at fixed trial type), and the interaction information \({I}^{\mathrm{int}}\), defined as the difference between the real and the independent population information:

$${I}^{\,\mathrm{pop}}={I}^{\,\mathrm{ind}}+{I}^{\,\mathrm{int}}$$
(18)

The independent information \({I}^{{\rm{ind}}}\) is a function of only the single-neuron information values (equation (21) in the Supplementary Note). The interaction information quantifies the total effect of noise correlations on the stimulus information carried by the population. We decomposed the population interaction information into a sum of components expressing the contribution of different graph motifs of the interaction network (a graph in which the edges correspond to interaction information values). As a result, the interaction information is a function of the single-neuron information values, the pairwise interaction information values and the number (probability) of triplet motifs (equation (31) in the Supplementary Note). To assess the total contribution of network structure (the triplet-wise arrangement of pairwise interaction information links) to the information representation, relative to an unstructured, random arrangement of the same links within the network, we then rearranged this expansion to break down the interaction information into the following two components:

$${I}^{\mathrm{int}}={I}^{\mathrm{unstructured}}+{I}^{\mathrm{structured}}$$
(19)

These two terms are calculated in the Supplementary Note (‘The population information component corresponding to the network organization of pairwise interactions’), and their equations are reported in equations (32) and (33) in the Supplementary Note. The unstructured interaction information component is the information of a population with the same distribution of pairwise interactions as the data, but with the interactions randomly rearranged across neuron pairs by shuffling (thus removing network structure), minus the independent information. This component quantifies the contribution of the distribution of pairwise interactions to the population information. The structured interaction information component is defined as the difference between the total population information of the original network and the population information of the unstructured network. Thus, this component isolates the contribution of the structure of pairwise interactions in the network.
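A shuffling scheme of this kind can be sketched as follows. This is our illustration under the Gaussian and linear-Fisher assumptions, not the authors' estimator (their exact expressions are equations (32) and (33) of the Supplementary Note); note also that a shuffled covariance is not guaranteed to remain positive definite for arbitrary inputs:

```python
import numpy as np

def shuffle_interactions(sigma, rng):
    """Keep the diagonal (single-neuron variances) but randomly rearrange
    the pairwise couplings across neuron pairs, preserving symmetry:
    an 'unstructured' network with the same distribution of links."""
    n = sigma.shape[0]
    iu = np.triu_indices(n, k=1)
    shuffled = np.diag(np.diag(sigma)).astype(float)
    vals = rng.permutation(sigma[iu])
    shuffled[iu] = vals
    shuffled.T[iu] = vals  # mirror into the lower triangle
    return shuffled

def structure_components(mu_a, mu_b, sigma, n_shuffles=100, seed=0):
    """I_int = I_unstructured + I_structured, with the unstructured part
    estimated by averaging information over shuffled networks."""
    rng = np.random.default_rng(seed)
    dmu = np.asarray(mu_b, float) - np.asarray(mu_a, float)

    def info(s):
        return float(dmu @ np.linalg.solve(s, dmu))

    i_pop = info(sigma)
    i_ind = info(np.diag(np.diag(sigma)))
    i_shuf = np.mean([info(shuffle_interactions(sigma, rng))
                      for _ in range(n_shuffles)])
    return i_shuf - i_ind, i_pop - i_shuf  # (unstructured, structured)
```

By construction, the two returned components always sum to the total interaction information of the original network.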

To compute the unstructured and structured triplet interaction information, we used equation (31), as given in the Supplementary Note, after determining the number and weights of the different motifs in both the real and shuffled networks (Supplementary Note ‘The population information component corresponding to the network organization of pairwise interactions’). For calculating information, we used an expansion of population information that is also valid for nonsymmetric networks (Extended Data Fig. 8 and Supplementary Note ‘Simplifying the structured interaction information’). However, in Fig. 6b–d, we illustrate this expansion for the simple case of a symmetric network, which accurately describes our empirical data. In fact, we showed in the Supplementary Note (‘Analytical calculation of triplet probabilities in the two-pool model of network structure’) that, in the case of a symmetric two-pool network, these computations can be greatly simplified because of the symmetries present in the network. In the symmetric case, the only motif contributing to the structured component of the interaction information is the closed triplet (Supplementary Note ‘Analytical expansion of the population information’) and its associated triplet probabilities and weights. As shown in Fig. 6c, four different types of triplets can be defined, with varying combinations of IE and IL interaction links. The two motifs with an even number of IL links (‘+,+,+’ and ‘+,−,−’) contribute positively to the structured interaction information, and the two with an odd number of IL links (‘+,+,−’ and ‘−,−,−’) contribute negatively.
The strength of each triplet type's contribution to the structured interaction information is given by the deviation of its probability from the level in the shuffled network (computed as described in ‘Triplet statistics and global clustering coefficient’), multiplied by a non-negative numerical coefficient that is a function of the single-neuron and pairwise interaction information values that shape the triplet. Equations for these terms are reported in equations (46) and (47), as given in the Supplementary Note. As shown through analytical calculations in equation (31), as given in the Supplementary Note, one important property of these three terms is the difference in how they scale with population size: the independent information scales linearly, whereas the unstructured and structured interaction information terms scale faster, as the second and third powers of population size, respectively. We did not include the spatial arrangement of neurons in these calculations because we did not find strong anatomical clustering of neurons projecting to the same target, and because the differences in correlations between nonlabeled neurons and neurons projecting to the same target did not depend strongly on anatomical distance (Extended Data Fig. 5c).
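The parity rule for closed triplets can be illustrated with a short sketch (ours, not the paper's code), assuming IE links are represented as positive weights and IL links as negative weights in a symmetric interaction matrix, with a zero entry meaning no link:

```python
import numpy as np
from itertools import combinations

def triplet_parity_counts(w):
    """Classify closed triplets by the parity of their IL (negative)
    links: an even number ('+,+,+' or '+,-,-') contributes positively
    to the structured interaction information, an odd number
    ('+,+,-' or '-,-,-') negatively."""
    counts = {"even": 0, "odd": 0}
    n = w.shape[0]
    for i, j, k in combinations(range(n), 3):
        links = (w[i, j], w[i, k], w[j, k])
        if any(link == 0 for link in links):
            continue  # open triplet: no closed-motif contribution
        n_neg = sum(link < 0 for link in links)
        counts["even" if n_neg % 2 == 0 else "odd"] += 1
    return counts
```

In the full expansion, each count would then be compared against its shuffled-network expectation and weighted by the corresponding non-negative coefficient.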

Statistics and reproducibility

No statistical method was used to predetermine the sample size. Data were excluded from the analyses based on subjective assessments of imaging quality made at the time of acquisition and before data analysis. All data were acquired before data analysis. Blinding and randomization of groups were not applicable. Nonparametric statistical methods were applied throughout, with Holm–Bonferroni correction for multiple comparisons. Sample sizes, statistical tests and P values are listed in Supplementary Table 2.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.