Extended Data Fig. 10: Two approaches to clustering the pDMSt neuronal responses in the post-outcome period, and no significant behavioral difference in the post-outcome period after cued versus uncued success.

Panels a-o show neural activity. Panel p shows behavior. Panels a-e show the first approach, a generalized linear model (GLM). Panels f-l show the second approach, tensor regression. This figure analyzes only putative striatal projection neurons (SPNs) (Methods, ‘Identifying putative SPNs’). This figure uses only the training set (half of the data set) to cluster the pDMSt neuronal responses. Fig. 5 uses the other half of the data set (the test set) to decode behavior from neural activity. a, We built a GLM to describe how each neuron’s activity relates to behavior events (Methods, ‘Generalized linear model’). A GLM attempts to use behavior events to predict neural activity. The result is a set of coefficients, or weights, assigned to each neuron for each behavior event. These weights capture the pattern of that neuron’s response to the behavior event. Below “Behavior”, we list the behavior events. Below “GLM” and to the right of each behavior event, we show the resulting GLM coefficients, averaged across all neurons. 0 s is the time of the behavior event. For “outcome: success”, “outcome: failure”, “cue x success” and “cue x failure”, 0 s is t_arm, the moment the arm is outstretched during the reach. b, Note that the first three GLM coefficients (“cue”, “distractor”, “reach”) are not aligned to the outcome, so we ignored them for subsequent analysis. We took the GLM coefficients after an outcome (“outcome: success”, “outcome: failure”, “cue x success” and “cue x failure”) in the post-outcome period (>0 s, gray shaded area). For each neuron, we concatenated these 4 sets of coefficients into a single vector. We call this vector the “outcome profile” of the neuron. Neurons lacking any GLM coefficient greater than zero in the post-outcome period do not have an outcome profile and were excluded. We clustered the outcome profiles of all remaining neurons using k-means clustering. c, The Davies-Bouldin Index (DBI) for different numbers of k-means clusters. Lower values are better. d, The result of k-means clustering for 2 clusters. Each dot is one neuron. top, tSNE of the outcome profiles. bottom, Same tSNE, but here neurons are colored according to the mouse from which each neuron was recorded. e, GLM coefficients after an outcome. Neurons are missing if they did not have any GLM coefficient greater than zero in the post-outcome period (Methods, ‘Clustering the GLM coefficients in the POP’). f, Tensor regression attempts to predict the behavior trial type (cued success, cued failure, uncued success or uncued failure) from the neural activity of all neurons together. Like principal components analysis (PCA), the tensor regression produces multiple components. (See Methods, ‘Setting up the tensor regression’ for more details.) Here we show the trial-averaged activity of all of the neurons, sorted into neurons whose loading onto component 1 exceeds their loading onto component 2 (top row) or vice versa (bottom row). Within each row, we further sorted the neurons according to the time delay of the peak response near a cued success. g, Schematic describing tensor regression, i.e., regress behavior trial type against neural activity, then represent the result as a sum of components. Each component is the outer product of three vectors, one each over neurons, timepoints and behavior trial types (more details in Methods, ‘Setting up the tensor regression’). We ran an optimization to find the tensor regression solution (Methods, ‘Tensor regression optimization’). This solution is not unique, so different initial conditions produce different results.
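For concreteness, the outcome-profile clustering in panels b and c could be sketched as below. This is a minimal illustration, not the authors’ code: the dict `coefs` (per-event GLM coefficient arrays of shape neurons × timepoints), the time axis `t` (seconds relative to t_arm) and the tested range of cluster numbers are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score

# Outcome-aligned events used to build each neuron's "outcome profile".
events = ["outcome: success", "outcome: failure", "cue x success", "cue x failure"]
post = t > 0  # post-outcome period: timepoints after t_arm (assumed time axis `t`)

# Concatenate the post-outcome coefficients of the 4 events into one vector per neuron.
profiles = np.concatenate([coefs[e][:, post] for e in events], axis=1)

# Exclude neurons with no GLM coefficient greater than zero in the post-outcome period.
keep = (profiles > 0).any(axis=1)
profiles = profiles[keep]

# k-means clustering; the Davies-Bouldin Index (lower is better) compares cluster numbers.
for k in range(2, 7):  # range of k is an assumption for illustration
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(profiles)
    print(k, davies_bouldin_score(profiles, labels))
```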
Panels h and i summarize the results over multiple optimization runs. h, The joint loading penalty penalizes solutions in which one neuron relies too heavily on more than one component. We chose a solution with a low joint loading penalty, which is a parsimonious solution that loads different components onto different sets of neurons. (See Methods, ‘Choosing a specific tensor regression solution’.) i, We tried different numbers of components (Methods, ‘Selecting the rank of the tensor regression’). The 2-component solutions had a loss similar to the more complicated 5-component solutions. Therefore, for simplicity, we selected a 2-component solution. j, Result of the tensor regression. left, Loadings onto neurons for component 1 (purple) versus component 2 (cyan). Note that the two components target largely non-overlapping groups of neurons. middle, Loadings onto timepoints for component 1 (purple) versus component 2 (cyan). right, Loadings onto behavior trial types for component 1 (purple) versus component 2 (cyan). k, To determine whether the tensor regression simply clusters noise, we asked the tensor regression to predict the behavior trial type from the neural activity in the test set (see Methods, ‘Training and test sets’). Results are shown as a confusion matrix for the training (left) and test (right) sets. l, Shuffles accompanying k. top left, Neuron ID shuffle. top right, Timepoints shuffle. bottom left, Shuffle both neuron IDs and timepoints. bottom right, Shuffle behavior trial type. m, Metrics to summarize the post-outcome period GLM coefficients (also see Methods, ‘The simpler approach to the neuron groups 1 and 2’ used in Fig. 5). Sustained after failure is the absolute value of the average coefficient in the time window 1 to 5 s after the time of the arm outstretched, t_arm. Modulation index (mod index) is the GLM coefficient average from 2 to 5 s minus the GLM coefficient average from 0 to 2 s after t_arm, divided by the sum of these two quantities. n, Response of each neuron summarized by the metrics explained in panel m. Each dot is a neuron. top, Colors are from Cluster 1 (purple) and Cluster 2 (cyan) in panel d. bottom, For simplicity, we drew a line to roughly separate the purple and cyan neurons of Clusters 1 and 2. We used this line to divide the neurons into two groups, called Consensus Group 1 and Consensus Group 2. These Consensus Groups were used to make Fig. 5 (see more explanation in Methods, ‘The simpler approach to the neuron groups 1 and 2’ used in Fig. 5). o, Average ± s.e.m. of GLM coefficients across neurons. Neurons grouped into Consensus Group 1 (purple) and Consensus Group 2 (cyan). “Align” shows cue coefficients after subtracting pre-cue baseline (i.e., t < 0 s). p, Integral-normalized histograms of behavior metrics from the post-outcome period. left, Chewing duration after a successful reach, comparing cued to uncued successes. P-value from Wilcoxon rank sum test is 0.6. right, Number of additional, confirmatory reaches after a failed reach, comparing cued to uncued failures. P-value from Wilcoxon rank sum test is 0.04. n = 3685 cued successes, 916 uncued successes, 4724 cued failures, 2414 uncued failures from 17 mice.
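The panel-m metrics and the panel-p comparison could be computed as in the sketch below. This is an illustrative reconstruction under stated assumptions, not the authors’ code: `coef` is a hypothetical 1-D array of one neuron’s outcome-aligned GLM coefficients (e.g., the “outcome: failure” coefficients), `t` is the time axis in seconds relative to t_arm, the window-boundary conventions are assumptions, and `chew_cued`/`chew_uncued` are hypothetical per-trial arrays of chewing durations.

```python
import numpy as np
from scipy.stats import ranksums

# Panel m, metric 1: "sustained after failure" = |mean coefficient| from 1 to 5 s after t_arm.
sustained_after_failure = np.abs(coef[(t >= 1) & (t <= 5)].mean())

# Panel m, metric 2: modulation index = (mean 2-5 s minus mean 0-2 s) / (their sum).
late = coef[(t >= 2) & (t <= 5)].mean()   # mean coefficient, 2 to 5 s after t_arm
early = coef[(t >= 0) & (t < 2)].mean()   # mean coefficient, 0 to 2 s after t_arm
mod_index = (late - early) / (late + early)

# Panel p: Wilcoxon rank sum test comparing a per-trial behavior metric across trial types,
# e.g., chewing duration after cued versus uncued successes.
stat, p_value = ranksums(chew_cued, chew_uncued)
```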