Vectorized instructive signals in cortical dendrites

Francioni, Valerio; Tang, Vincent D.; Toloza, Enrique H. S.; Ding, Zilan; Brown, Norma J.; Harnett, Mark T.

doi:10.1038/s41586-026-10190-7

Article
Open access
Published: 25 February 2026

Valerio Francioni^1,2,
Vincent D. Tang^1,2^na1,
Enrique H. S. Toloza^1,3,4^na1,
Zilan Ding^1,2^na1,
Norma J. Brown^1,2 &
…
Mark T. Harnett ORCID: orcid.org/0000-0002-5301-1139^1,2

Nature (2026)

Abstract

Vectorization of teaching signals is a key element of almost all modern machine learning algorithms, including backpropagation, target propagation and reinforcement learning. Vectorization allows a scalable and computationally efficient solution to the credit assignment problem by tailoring instructive signals to individual neurons. Recent theoretical models have suggested that neural circuits could implement single-phase vectorized learning at the cellular level by processing feedforward and feedback information streams in separate dendritic compartments^1,2,3,4,5. This presents a compelling, but untested, hypothesis for how cortical circuits could solve credit assignment in the brain. Here we used a neurofeedback brain–computer interface task with an experimenter-defined reward function to test for vectorized instructive signals in dendrites. We trained mice to modulate the activity of two spatially intermingled populations (four or five neurons each) of layer 5 pyramidal neurons in the retrosplenial cortex to rotate a visual grating towards a target orientation while we recorded GCaMP activity from somas and corresponding distal apical dendrites. We observed that the relative magnitudes of somatic and dendritic signals could be predicted using the activity of the surrounding network and contained information about task-related variables that could serve as instructive signals, including reward and error. The signs of these putative teaching signals depended on the causal role of individual neurons in the task and predicted changes in overall activity over the course of learning. Furthermore, targeted optogenetic perturbation of these signals disrupted learning. These results demonstrate a vectorized instructive signal in the brain, implemented via semi-independent computation in cortical dendrites, unveiling a potential mechanism for solving credit assignment in the brain.

Main

Learning is the product of changes in the strength of synaptic connections between neurons^{6,7,8,9,10,11,12,13}. Synaptic modifications can have difficult-to-predict effects on network output, particularly in complex hierarchical networks such as the brain. The challenge of determining how individual synapses should be altered to improve task performance is known as the credit assignment problem^{14,15,16,17,18}. Whereas this problem is effectively solved in artificial neural networks (ANNs) by the backpropagation-of-error algorithm¹⁹, how credit assignment is solved in the brain remains unknown^14,15.

Recent theoretical work has proposed several models by which biological circuits could solve credit assignment, including target learning and backpropagation-like algorithms^{1,2,3,4,5,20,21}. Central to both artificial and biologically inspired solutions to credit assignment is the vectorization of instructive signals, as opposed to the broadcasting of a single scalar teaching signal¹⁴. Effective learning requires, in addition to vectorization, instructive signals to be separable from feedforward inputs to prevent interference¹⁵. In ANNs, this is achieved via temporal separation, which has long been thought to be biologically implausible. One hypothesis is that in cortex, credit-related information is spatially, rather than temporally, segregated in the apical dendrites of pyramidal neurons¹⁵. This aligns with anatomical and circuit evidence that feedforward inputs are received perisomatically and feedback inputs are received in the distal dendrites^{22,23,24,25,26,27,28,29,30,31}. However, direct evidence regarding the subcellular mechanisms of credit assignment is lacking.

Vectorized teaching signals at the dendritic level should meet four experimentally testable conditions. First, dendritic activity should contain information that is not present in somatic activity alone (although somas could theoretically transmit gradients using qualitatively different spiking patterns^2,4,32, the cable properties of dendrites predict some level of independence between somatic and dendritic activity). Second, dendritic activity should encode information about task performance that could serve as instructive signals, such as reward and error representations. Third, dendritic activity should reflect the contribution of that neuron to task performance (that is, the reward function). Fourth, disrupting vectorized instructive dendritic signals should impair learning.

Specifying a reward function using a BCI task

Evaluating credit assignment in biological neural networks has thus far proved impossible^14,15. Teaching signals can only be defined relative to a reward function that maps neural activity to task performance. It is unclear whether such functions are explicitly represented in the brain. Even if they are, experimenters are blind to their specific formulation in terms of neural activity¹⁵. Neurofeedback brain–computer interface (BCI) tasks present a potential solution to this problem by directly coupling neural activity to task performance, thereby allowing the experimenter to specify the reward function to be optimized^14,20,21. Previous studies have shown that mice are able to learn BCI tasks using a variety of feedback stimuli and brain areas and that learning induces changes in the activity of the neurons controlling the BCI, including in the hippocampus and various sensory and motor cortices^{33,34,35,36,37,38,39}. Here we leveraged a visually guided neurofeedback BCI task in cortical pyramidal neurons to test subcellular mechanisms for error and reward-related signalling (Fig. 1a–c and Supplementary Figs. 1 and 2). We trained head-fixed mice under a 2-photon microscope to control the activity of two spatially intermingled sets of GCaMP7f-labelled layer 5 pyramidal neurons, in the retrosplenial cortex (RSC), designated P+ and P− (selection criteria in Extended Data Figs. 1 and 4b and Methods). The difference in mean somatic GCaMP activity of P+ versus P− neurons was coupled to rotation of a visual grating relative to a rewarded target angle^{33,34,35,36,38,39} (Fig. 1d–f and Supplementary Data Fig. 1). We selected RSC owing to the optical accessibility of layer 5 and previous demonstration of independent dendritic events in this area⁴⁰. 
We recorded GCaMP activity at 15 Hz in the proximal trunk dendrite as a proxy for somatic activity; this allowed imaging of many neurons while reducing signal contamination owing to the more precise spatial footprint and faster signal kinetics of the apical trunk^41,42,43. We measured task performance with two metrics: accuracy, which represented the fraction of rewarded trials; and speed, which represented the number of rewards obtained per minute. Mice (n = 6) learned the task by both metrics (Fig. 1g and Extended Data Figs. 2 and 3).
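As a minimal sketch of the two performance metrics defined above, the following Python function computes accuracy and speed from a list of trial outcomes. The function name, its arguments and the use of total session duration as the denominator for speed are illustrative assumptions and are not taken from the authors' code.

```python
def task_performance(rewarded, session_minutes):
    """Return (accuracy, speed) for one session.

    rewarded: sequence of booleans, one per trial (True = rewarded).
    session_minutes: session duration in minutes (assumed denominator
    for the speed metric).
    """
    if len(rewarded) == 0 or session_minutes <= 0:
        raise ValueError("need at least one trial and a positive duration")
    n_rewards = sum(1 for r in rewarded if r)
    accuracy = n_rewards / len(rewarded)   # fraction of rewarded trials
    speed = n_rewards / session_minutes    # rewards obtained per minute
    return accuracy, speed
```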

**Fig. 1: Mice learn a neurofeedback BCI task through the differential regulation of P+ and P− neurons.**

We compared activity levels of P+ and P− populations, as well as the population of surrounding neurons that were not directly involved in the rotation of the stimulus (termed P₀), across days of task performance. We imaged the same neurons longitudinally throughout all experiments. We found that learning was accompanied by the differential regulation in the activity of P+ and P− neurons over days (Fig. 1h,i), with P+ neurons maintaining their activity levels while P− neurons were downregulated. Whereas, on average, changes in activity in P₀ neurons resembled changes in P+ neurons (Fig. 1i), selecting the subpopulation of P₀ neurons with matching activity levels of P+ and P− neurons on day 1 revealed that changes in activity in P₀ neurons fell in between those of P+ and P− neurons (Extended Data Fig. 4). As the most active neurons on day 1 were also those that were most strongly downregulated (Extended Data Fig. 4c), our results are consistent with a model of learning by sparsification, an energy-efficient solution to the task⁴⁴. Increases in task performance were not correlated with changes in locomotion across days (Extended Data Fig. 3). Moreover, the P+ and P− populations were spatially intermingled, and had the same GCaMP transient frequency on day 1 (Extended Data Figs. 1 and 4a), ruling out the possibility of learning the task by simply engaging a non-specific gain modulation mechanism.

Dendrites contain information not found in their somas

To determine whether apical dendritic activity contained information that was not encoded in parent somatic activity alone, we used an electrically tunable lens to semi-simultaneously (15 Hz per plane) record activity in proximal and distal trunk dendrites across learning (Fig. 2a). We paired proximal and distal dendrites on the basis of the Pearson correlation of their GCaMP signals, thresholded at r = 0.6 as in previous studies^41,42,43. Previous work in brain slices demonstrated that dendritic GCaMP signals are larger when current is injected in the distal trunk and smaller when current is injected at the soma⁴¹ (controlling for the same number of triggered corresponding action potentials). This indicates that differences in somatic versus dendritic magnitude for coincident GCaMP events reflect the spatial bias of the different inputs that target these two compartments. To estimate the magnitude of somatic and dendritic events, we first deconvolved the GCaMP traces of somas and dendrites using CASCADE⁴⁵. Deconvolution allowed us to correct for the well-described problem of different signal kinetics across dendritic compartments⁴⁶. Next, we used an area-under-the-curve approach to quantify the magnitude of individual transients (all main results were also validated using a ΔF/F₀-based approach to estimating the magnitude of transients; Methods and Supplementary Fig. 3) and defined events as coincident whenever they occurred within 500 ms of each other. As these coincident events represent the vast majority of GCaMP transients^{40,41,42,43,46,47,48,49,50,51,52}, we focused all subsequent analysis on events for which a transient was detected in both compartments.
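The pairing and coincidence criteria described above can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, treating r ≥ 0.6 (rather than r > 0.6) as a match is an assumption, and the greedy one-to-one matching of sorted event times is one simple way to apply the 500 ms window, not necessarily the authors' procedure.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long traces."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("traces must have equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant trace")
    return sxy / math.sqrt(sxx * syy)

def is_same_neuron(proximal, distal, r_threshold=0.6):
    """Pair a proximal and a distal trace if they correlate strongly enough."""
    return pearson_r(proximal, distal) >= r_threshold

def coincident_events(soma_times, dend_times, window_s=0.5):
    """Greedily match sorted event times (seconds) that lie within window_s.

    Returns a list of (soma_index, dendrite_index) pairs.
    """
    pairs = []
    i = j = 0
    while i < len(soma_times) and j < len(dend_times):
        dt = dend_times[j] - soma_times[i]
        if abs(dt) <= window_s:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dt < 0:
            j += 1   # dendritic event is too early; skip it
        else:
            i += 1   # somatic event has no dendritic partner; skip it
    return pairs
```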

**Fig. 2: Differences in somatic and dendritic magnitudes for coincident events are predicted by local network dynamics.**

Empirically, we observed that the relative magnitude of coincident events in somas and dendrites varied substantially, despite event timing correlation being very high (Fig. 2b; consistent with prior studies^{40,41,43,46,47,49}). As event magnitudes at soma and dendrites were best described by a linear relationship (Extended Data Figs. 5 and 6b), we assessed the relative degree of dendritic amplification versus attenuation with a best-fit line through all events and then calculated the somato-dendritic residual (SD residual) associated with individual transients⁴³ (Fig. 2b,c). This captured the variance of dendritic responses for a given somatic event magnitude. We then defined positive and negative residuals as dendritically amplified and attenuated events, respectively.
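A minimal sketch of the SD residual computation described above, assuming an ordinary least-squares fit of dendritic on somatic event magnitude (the exact fitting procedure is not specified in this section, and all function names are hypothetical):

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired events")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("somatic magnitudes are all identical")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def sd_residuals(soma_mag, dend_mag):
    """Dendritic magnitude minus the value predicted from the soma."""
    slope, intercept = fit_line(soma_mag, dend_mag)
    return [d - (slope * s + intercept) for s, d in zip(soma_mag, dend_mag)]

def classify_events(residuals):
    """Positive residual = amplified, otherwise attenuated."""
    return ["amplified" if r > 0 else "attenuated" for r in residuals]
```

With a least-squares fit, the residuals are uncorrelated with the somatic magnitudes by construction, so any structure in them reflects variance that the somatic event magnitude does not explain linearly.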

To test whether SD residuals contain information that is biologically meaningful, we used activity from all the somas in our field of view in the 2 s preceding individual GCaMP events in a neuron of interest (P+ and P− neurons on days 1 to 14) to predict whether these events were dendritically amplified or attenuated (Fig. 2d). To do so, we used a linear support vector machine (SVM), a common algorithm to both classify and regress using high-dimensional data. We found that the performance of our binary classifier on individual neurons strongly correlated with the ability of the decoder to capture the magnitude of dendritic amplification or attenuation in the classification confidence (Fig. 2e,g,h and Extended Data Figs. 6c,d and 7a,b). This was an emergent property, as the decoder was trained for binary classification only and had no information about the magnitude of dendritic amplification or attenuation. Among 466 neurons, approximately 20% showed a significant correlation between classification confidence and the magnitude of SD residual (Fig. 2h and Extended Data Figs. 6c,d and 7a,b). We found that in these neurons, we could accurately decode 61% of the events as being either amplified or attenuated, well above the 50% chance level (Fig. 2j and Extended Data Figs. 6e and 7c). Additionally, at the single-cell level, we found a statistically significant positive Pearson correlation between classification confidence and SD residual, demonstrating that the surrounding network of neurons can be used to predict the amplitude of the residual for coincident somato-dendritic transients (Fig. 2k and Extended Data Figs. 6f and 7d). Of note, our analysis approach completely decorrelates somatic event magnitude from SD residuals (Fig. 2f,i and Extended Data Fig. 6a), indicating that mismatches in somato-dendritic coupling are predicted independently from somatic activity and represent information encoded de novo in the dendrites.
Additionally, our results demonstrate that P₀ neurons could be decoded at the same level as P+ and P− neurons (Extended Data Fig. 8), and that decoding does not depend on somatic responses to visual stimuli across the three subpopulations (Extended Data Fig. 9).
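To make the decoding logic above concrete, the following sketch trains a linear SVM by full-batch subgradient descent on the regularized hinge loss and exposes the signed decision value, which plays the role of the classification confidence. It is a hand-rolled stand-in written for illustration: the SVM implementation, hyperparameters and feature construction used in the study are not given in this section, so the function names, the defaults (`lam`, `lr`, `epochs`) and the labelling convention (+1 = amplified, −1 = attenuated) are all assumptions.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit weights w and bias b by subgradient descent on the hinge loss.

    X: list of feature vectors (e.g. network activity before each event).
    y: list of labels, +1 (amplified) or -1 (attenuated).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]   # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:              # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def decision_value(w, b, x):
    """Signed score: the sign is the predicted class, the magnitude the confidence."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b
```

In an analysis of this kind, the sign of `decision_value` on held-out events would give the amplified-versus-attenuated prediction, and its value could then be correlated with the SD residual of the same events.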

We further found that dendritically amplified events consistently peaked earlier than dendritically attenuated events compared with the soma (Fig. 2l,m and Extended Data Figs. 6g and 8e), congruent with results in brain slices⁴¹.

Experimental perturbation of SD residuals

Previous studies indicate that anaesthesia reduces top-down input and/or inhibits apical tuft dendrites in layer 5 pyramidal neurons^22,53,54,55. We therefore hypothesized that the SD residual should be reduced during anaesthesia compared with wakefulness. To test this, we simultaneously recorded somatic and dendritic activity of layer 5 pyramidal neurons in RSC during these two conditions (Fig. 3a–c). Consistent with previous findings²², we observed a marked effect of anaesthesia on the frequency of GCaMP transients (Fig. 3d). For each neuron, we used all events detected during wakefulness to establish the distribution of SD residuals during awake periods. We then measured the effect of anaesthesia on the SD residual using the best-fit somato-dendritic line calculated during wakefulness. Anaesthesia strongly reduced the SD residual (Fig. 3c,e), consistent with previous observations of decreased top-down input^22,55.

**Fig. 3: Experimental manipulation of SD residuals.**

Prior work has also demonstrated that NDNF-positive layer 1 inhibitory interneurons can inhibit the apical dendrites of pyramidal neurons^53,54. We therefore tested whether NDNF-mediated inhibition reduced SD residuals, indicative of a preferential effect on apical dendritic activity. To do so, we co-injected NDNF-Cre mice (n = 4) with both a Cre-dependent version of ChRmine in layer 1 and GCaMP7f expressed under the control of the synapsin promoter in layer 5 (Fig. 3f,g). We then recorded somatic and dendritic GCaMP activity of individual layer 5 neurons, in the presence and absence of layer 1 NDNF⁺ interneuron activation via an LED light (Fig. 3h and Extended Data Fig. 10). Similar to our approach during anaesthesia, we first established the control relationship between somatic and dendritic event amplitudes for each neuron and then compared this to the SD residuals of activity during optogenetic activation. NDNF⁺ interneuron activation reduced the frequency of GCaMP transients (Fig. 3i) and, consistent with a number of previous ex vivo and in vivo studies^53,54,56, strongly decreased the SD residual in individual layer 5 pyramidal neurons (Fig. 3h,j). LED illumination alone (conducted in a control cohort of five mice) was not responsible for these results (Extended Data Fig. 10). Together, these results demonstrate that SD residual is predictably affected by two independent experimental manipulations in vivo, establishing it as a robust metric of dendritic versus somatic activity.

SD residuals decode reward and trial outcome

Next, we evaluated whether the SD residual contained information about task-related variables that could serve as putative teaching signals. We first tested whether changes in SD residual at the population level contained reward-related information. For each imaging session, we decoded rewarded versus unrewarded trials by comparing the 2 s following neural activity reaching target activation on rewarded trials with the analogous 2 s timeout period during unrewarded trials (Fig. 4a–c). Using a linear SVM trained on SD residuals (Methods), we were able to decode at 63% accuracy on average, above both chance and shuffle performance (Fig. 4d,e and Extended Data Figs. 6 and 11a,b).
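A shuffle control of the kind referred to above can be sketched as a label-permutation test. This simplified version permutes the true trial labels against a fixed set of decoder predictions; the procedure used in the study (described in its Methods) may instead retrain the decoder on shuffled labels. The function names, number of shuffles and seed are illustrative assumptions.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of trials whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def shuffle_null(y_true, y_pred, n_shuffles=1000, seed=0):
    """Accuracies obtained after randomly permuting the true labels."""
    rng = random.Random(seed)
    labels = list(y_true)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(labels)
        null.append(accuracy(labels, y_pred))
    return null

def permutation_p_value(observed, null):
    """One-sided p-value with the usual +1 correction."""
    n_extreme = sum(1 for a in null if a >= observed)
    return (n_extreme + 1) / (len(null) + 1)
```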

**Fig. 4: A population vector of SD residuals contains reward-related information.**

Next, we tested whether inputs onto the apical tuft dendrites represent instructive signals during learning. We used SD residuals to decode successful versus unsuccessful trials in the 2 s periods preceding successful target activation versus timeout, respectively. Once again, we found that our decoder performed significantly above chance at 57% accuracy on average (Fig. 4c,f,g and Extended Data Figs. 6 and 11c,d), demonstrating that individual neurons encode information about the network states that correspond to successful versus unsuccessful outcomes in their SD residuals both before and after reward delivery. As the trial time we analysed is pre-outcome, our results indicate that the SD residuals encode instructive signals based on the task-associated reward function.

Finally, we tested the role of layer 1 inhibition in controlling dendritic signals encoding reward and trial outcome (Fig. 4h). To do this, we performed experiments on a second set of four mice expressing ChRmine in NDNF⁺ layer 1 interneurons. Optogenetic activation of layer 1 NDNF⁺ neurons, but not LED illumination alone (conducted in a control group of five mice), abolished task and reward-related information in the apical dendrites of layer 5 pyramidal neurons (Fig. 4i–l and Extended Data Fig. 10), highlighting a potential role for local cortical inhibition in dendritic processing of task-related variables.

SD residuals reflect neuron-specific task error signals

We exploited the explicit definition of error and of functionally opposite classes of neurons in our experimental design to test whether error signals are received at apical dendrites and, if so, whether they differ between neurons according to each neuron’s causal role in the task (Fig. 5a,b). We reasoned that a scalar error signal would manifest as amplified dendritic activity during periods of error reduction for both P+ and P− neurons and as attenuated dendritic activity during times of error increase. However, a vectorized error signal would exhibit selective P+ versus P− dendritic activation, as the activity of each group is causally mapped to error in opposite ways. To disambiguate between these scenarios, we averaged the error in 2 s windows throughout the task and defined each window as an error increase or decrease epoch, given that the angle of the visual stimuli presented to the mice represented the instantaneous task-associated error (Fig. 5a). Next, we calculated the SD residuals for P+ and P− neurons for coincident soma–dendrite events in each window during error decrease and error increase epochs. As our analysis was restricted to time bins with coincident somato-dendritic events in P+ and/or P− neurons, any potential noise-driven flickering was not present in our analysis. We found that the dendrites of P+ neurons were relatively amplified during error reduction compared with error increase epochs (Fig. 5c). Dendrites in P− neurons exhibited the converse relationship: relative dendritic attenuation and amplification occurred during error reduction and error increase, respectively (Fig. 5d–f and Extended Data Figs. 12 and 6). This relationship could be observed in six out of six mice trained in the task (Extended Data Fig. 13e) and remained intact when we restricted our analysis to neurons whose somatic activity was the same during epochs of error increase and reduction (Extended Data Fig. 13). 
Additionally, the same inverted relationship between dendritic signals and task-associated errors was found in the dendrites of P₀ neurons that were functionally correlated to P+ and P− neurons (Extended Data Fig. 14). Of note, SD residuals represented error derivatives, not errors (Extended Data Fig. 13), in contrast to instructive signals found in the classical implementations of backpropagation.
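The epoch analysis above can be sketched as follows. The sketch assumes that error is the angular distance of the displayed grating from the 90° target and that a window counts as an error decrease (or increase) epoch when its mean error is lower (or higher) than that of the preceding window; the text does not state the exact comparison, so this rule and all names below are assumptions.

```python
def label_error_epochs(angles_deg, frames_per_window, target_deg=90.0):
    """Label consecutive windows as 'decrease', 'increase' or None."""
    errors = [abs(target_deg - a) for a in angles_deg]
    n_windows = len(errors) // frames_per_window
    means = [
        sum(errors[k * frames_per_window:(k + 1) * frames_per_window])
        / frames_per_window
        for k in range(n_windows)
    ]
    labels = [None] if means else []      # first window has no predecessor
    for prev, cur in zip(means, means[1:]):
        if cur < prev:
            labels.append("decrease")
        elif cur > prev:
            labels.append("increase")
        else:
            labels.append(None)
    return labels

def mean_residual_by_epoch(residuals, window_index, labels):
    """Average SD residual of events falling in each epoch type."""
    acc = {"decrease": [0.0, 0], "increase": [0.0, 0]}
    for r, k in zip(residuals, window_index):
        lab = labels[k]
        if lab in acc:
            acc[lab][0] += r
            acc[lab][1] += 1
    return {lab: (s / n if n else None) for lab, (s, n) in acc.items()}
```

Under a scalar signal, both P+ and P− neurons would show a higher mean residual in "decrease" than in "increase" epochs; under a vectorized signal, that difference would be positive for P+ and negative for P− neurons.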

**Fig. 5: Dendritic error signals are cell-specific and depend on the causal contribution of the neuron to the task.**

Next, we tested whether vectorized error-related dendritic signals were necessary for learning by optogenetically activating NDNF⁺ layer 1 interneurons throughout the BCI task. This abolished vectorized error-related signalling in the apical tuft of layer 5 pyramidal neurons (Fig. 5f,g) and disrupted learning (Fig. 5h), whereas LED illumination alone in our control group did not (Extended Data Fig. 15). This demonstrates that local computation in the apical dendritic tuft is necessary for performance improvements in the BCI task.

Discussion

Here we demonstrate the use of neurofeedback brain–computer interfaces to study the mechanisms of biological credit assignment at the subcellular level. Our results provide—to our knowledge—the first biological evidence of a vectorized solution to the credit assignment problem in the brain via cortical dendrites. Our data are consistent with a model of credit assignment in which learning is instructed by instantaneous, vectorized teaching signals received onto the distal dendrites of pyramidal neurons^1,2,3,4,5. This spatial segregation mechanism allows cortical circuits to overcome the biologically implausible temporal separation of feedforward and feedback streams conventionally used for computing teaching signals during vectorized learning in ANNs.

The data presented here reveal magnitude differences in coincident somato-dendritic events that can be predicted using activity in the surrounding network of neurons. At the population level, differences in somato-dendritic coupling encode de novo information relative to somatic activity. This information could be used by individual neurons as instructive signals, such as reward and task error, providing novel evidence that individual neurons can explicitly access the reward function of a learning task through independent dendritic computation. We further demonstrate that cell-specific changes in SD residuals correlate with the functional role of individual neurons as well as with subsequent changes in activity levels during learning. Finally, optogenetic activation of NDNF⁺ layer 1 interneurons disrupted both dendritic computation and learning, demonstrating that dendritic processing is necessary for learning.

Our results demonstrate the existence of a signed, vectorized dendritic input that is tailored in a condition-specific manner to individual neurons. The extent to which this dendritic activity reflects moment-to-moment computational signals, as opposed to teaching signals for synaptic weight changes, remains to be uncovered. This could be explored via manipulations of dendritic activity through different phases of learning⁵⁷. Importantly, vectorized error signals need not be confined to learning. In control-theory-inspired frameworks of credit assignment, error signals are applied during task operation, to steer the system online^58,59. Further work is needed to assess whether these dendritic signals result from glutamatergic inputs from higher-order cortical areas, from neuromodulation, or as a product of recurrent excitatory and inhibitory local computation. Dopaminergic signalling specifically has been causally implicated in both error signalling and in learning neurofeedback BCI tasks in rodents and humans^33,60,61,62 and thus represents a compelling target for future investigation. Further experiments are also needed to test whether error signals are calculated locally at each hierarchical layer or are transmitted across layers, as in the classical formulation of backpropagation¹⁹. Previous neurofeedback BCI studies have demonstrated that degrading the contingency between neuronal activity and feedback stimuli impairs learning^33,34,35: future work will have to determine whether external stimuli are always necessary for error representations or whether animals can access the cost function via internal states exclusively, and how the dendritic representation of error might change as a result.

The error signals we observed have appealing connections to the gradient calculations found in the backpropagation algorithm. In contrast to the classical implementation of backpropagation, however, we observed that dendrites received signals that bore signatures of error derivative rather than error itself. Intriguingly, our results could also be consistent with target propagation (specifically, difference target propagation)^14,20,21. Indeed, our data indicate that dendritic activity contains a target signal for the parent soma in addition to task-related error information. Future approaches, built on the framework we present here, could be used to disentangle the specific learning algorithms used by the brain^14,63.

Together, our results help to reconcile early findings and theories of dendritic function, which focused on single dendritic branches as the building blocks for independent computation, with later in vivo findings that have demonstrated prevalent co-occurrence of dendritic and somatic events^15,24,51,64. By demonstrating that apical dendrites locally compute reward and error-related signals, our results present a framework for dendritic computation that does not require fully independent dendrites to perform credit assignment for adaptive behaviour and highlight new directions for the development of biologically inspired ANNs.

Methods

Animals

All experiments were compliant with guidance and regulation from the NIH and the Massachusetts Institute of Technology Committee on Animal Care. Male and female Rbp4-Cre and NDNF-Cre heterozygous mice were maintained on a 12 h:12 h light:dark cycle in a temperature- and humidity-controlled room with ad libitum food access and were used for experiments at 8–15 weeks of age. Except for anaesthesia experiments, mice were water-deprived by decreasing water intake from 3 ml to 1.2 ml over the course of 10–14 days and maintained at 1.2 ml thereafter, for 5–7 days before experiments and throughout training.

Surgery

Mice were initially anaesthetized using 4% isoflurane and subsequently maintained at 1–2% isoflurane through the rest of the surgery. Body temperature was maintained at physiological levels using a closed-loop heating pad. Additional heating was provided for post-surgical recovery. To protect eyes from dryness, eye cream (Bepanthen, Bayer) was applied. Mice were injected with dexamethasone (4 mg kg⁻¹), carprofen (5 mg kg⁻¹) and buprenorphine (slow release, 0.5 mg kg⁻¹) subcutaneously. The scalp was shaved using hair removal cream and cleaned afterwards using iodine solution and ethanol. Next, the skull was exposed. For in vivo imaging, a 3 mm-wide craniotomy was performed. In Rbp4-Cre mice, at 3–4 different sites, we injected 100 nl of AAV1-syn-FLEX-jGCaMP7f-WPRE (Addgene, 04492-AAV1, 2–5 × 10¹² viral genomes (vg) ml⁻¹ concentration after a 1:10 dilution from the original concentration) at 400 μm from the surface of the brain in the left hemisphere of the RSC (2.5 mm caudal to bregma). The same labelling approach was used to perform anaesthesia experiments. In NDNF-Cre mice, we injected 100 nl of AAV8-nEF-Con/Foff 2.0-ChRmine-oScarlet (Addgene, 137161-AAV8, 7 × 10¹² vg ml⁻¹ after a 1:5 dilution from the original concentration) 150 μm from the surface of the brain and 75 nl of AAV1-syn-jGCaMP7f-WPRE (Addgene, 104488-AAV1, 2–5 × 10¹² vg ml⁻¹ concentration after a 1:10 dilution from the original concentration) 500 μm from the surface of the brain. The dura was left intact. Cranial windows consisted of two stacked 3 mm coverslips (inserted within the craniotomy) attached to a larger 5 mm coverslip which was subsequently fixed to the skull using cyanoacrylate glue and dental cement. A custom metal headplate was implanted to perform imaging under head-fixed conditions. At the end of the procedure, a single dose of 25 ml kg⁻¹ of Ringer’s solution was injected subcutaneously to rehydrate the mouse. Recordings started 4–6 weeks post-surgery.

Two-photon imaging

A Neurolabware 2-photon microscope equipped with GaAsP photomultiplier tubes was used for data acquisition. Imaging was performed at 980 nm using an ultrafast pulsed laser (Spectra-Physics, Insight DeepSee) coupled to a 4× pulse splitter to reduce photodamage and bleaching. For excitation and photon collection we used a 16× Nikon objective with 0.8 numerical aperture. Bidirectional scanning was performed (512 × 796 pixels) semi-simultaneously in two separate planes using an electrically tunable lens at 30.92 Hz (15.46 Hz for each plane). Laser intensity was independently optimized at each imaging plane using an electro-optical modulator. A custom light shield was attached to the headplate to avoid light contamination. Mice were habituated to human handling for 5–10 min every day and to head-fixation for 15 min a day for at least 3 days directly preceding imaging. Small 10% sucrose water rewards were randomly dispensed during habituation. Daily water intake of at least 1.2 ml was maintained throughout the behavioural experiments. The locomotion of the mouse was recorded using an optical encoder (E6, US Digital, 2500 cycles per revolution) tracking the rotation of a cylindrical treadmill with a radius of 19 cm and acquired using the Scanbox software interfaced to a custom-built Arduino system. To maximize the number of units recorded while simultaneously reducing signal contamination, we imaged the trunk of layer five pyramidal neurons at two different planes: proximal to the soma and right below the nexus (tuft bifurcation point).

Optogenetic stimulation

For optogenetic stimulation we used a Cyclops LED driver from Open Ephys (OEPS-6602) triggered using a direct 6 ms TTL pulse delivered via the Neurolabware Dual PSOC box. The driver controlled a fibre-coupled 595 nm LED (8.7 mW, 100 mA, Thorlabs M595F2). LED illumination was synchronized with the photomultiplier tubes (PMTs) of the imaging system using custom-made MATLAB scripts. In brief, for every new frame acquired by the 2-photon microscope, the LED was activated for the initial 6 ms of the frame, while the PMTs were kept shut off for an additional 1 ms (7 ms total of PMT off time). PMTs would then reactivate to collect calcium data for the remaining approximately 24 ms of the approximately 31 ms frame.
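The per-frame timing budget described above amounts to simple arithmetic, sketched here with the approximate values quoted in the text (a frame of about 31 ms, a 6 ms LED pulse and 1 ms of additional PMT blanking). The function name and its defaults are illustrative.

```python
def pmt_duty_cycle(frame_ms=31.0, led_ms=6.0, extra_blank_ms=1.0):
    """Return (PMT off time, collection time, collected fraction) per frame."""
    pmt_off_ms = led_ms + extra_blank_ms   # LED pulse plus safety margin
    collect_ms = frame_ms - pmt_off_ms     # time left for photon collection
    return pmt_off_ms, collect_ms, collect_ms / frame_ms
```

With these defaults the PMTs collect for 24 of every 31 ms, or roughly 77% of each frame. Passing the frame period implied by the 30.92 Hz acquisition rate (about 32.3 ms) instead gives a collection time of roughly 25 ms, so the figures in the text are best read as approximate.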

BCI task

Similar to previous implementations of BCI learning paradigms^34,35,38, mice were trained so that they obtained rewards by modulating the activity of eight or ten layer 5 pyramidal neurons in the RSC to control the rotation of a grating Gabor patch. The 8 or 10 neurons were equally divided into 2 subpopulations, P+ neurons whose activity rotated the stimulus towards a target angle of 90° (horizontal) and P− neurons whose activity rotated the stimulus away from the target angle, towards a 0° (vertical) orientation. Neural activity was transformed into a visual stimulus angle according to the following method: at the beginning of each session, we measured the baseline responses of P+ and P− neurons to 7 randomly presented oriented gratings (0°, 15°, 30°, 45°, 60°, 75° and 90°, passive viewing) for approximately 13 min (12,000 frames). ΔF/F₀ was calculated for individual P+ and P− neurons and averaged across each population. The mean P− population signal was subtracted from the mean P+ population signal. Next, we randomly resampled 200 trials (435 frames each) from the aforementioned 12,000-frame baseline recording and iteratively searched (in 0.005 ΔF/F₀ incremental steps) for the subtracted ΔF/F₀ value producing a 50% success rate. That value was set as the threshold value for target activity during the closed-loop phase of the BCI task. Next, we calculated the mean and s.d. of the subtracted ΔF/F₀ signal distribution and created a new distribution by mirroring the left side to the right. On day 1, we estimated the z-score corresponding to the ΔF/F₀ threshold value on the mirrored distribution. On the following days, we estimated the subtracted ΔF/F₀ signal distribution and its corresponding left-mirrored distribution in the same way as described above, and used the ΔF/F₀ value corresponding to the z-score used on day 1 as the task target activity during the closed-loop phase of the task. 
In this way, mice could learn the task by either decreasing the activity of P− neurons or increasing the activity of P+ neurons (or both). The mapping between neuronal activity and visual feedback angle was defined as follows: 0° corresponded to the minimum value in the subtracted ΔF/F₀ signal distribution, whereas the target (90°) was reached at the subtracted ΔF/F₀ value corresponding to the threshold defined above. Activity in between was split into 7 equally spaced bins, each corresponding to a 15° interval between 0° and 90°. At each screen refresh, the angle presented reflected the mapping between angle bins and the subtracted ΔF/F₀ signal averaged over the last 3 frames. The screen refreshed every time a 2-photon frame at the soma was acquired (at 15 Hz). In line with previous studies performing neurofeedback BCI in rodents^33,34,35,36,38, we binned the visual stimulus to avoid noise-driven, frame-by-frame stimulus updates at the screen refresh rate, which is beyond the perceptual threshold in mice^65,66. To avoid introducing a second, orthogonal dimension to our task that would disrupt the straightforward mapping between neuronal activity and task error, we did not impose any requirement on the number of P+ or P− neurons that had to be simultaneously active to trigger a reward or a stimulus update. In each trial, mice had 28 s to reach target activity. If they did, a reward consisting of 4 μl of 10% sucrose water was delivered 1 s later. Additionally, after target activity was reached, the stimulus froze at a 90° angle for 2 s. After that, mice saw a black screen for an additional 1 s and a new trial was initiated. All new trials were initiated by a 0.5 s isoluminant grey stimulus. If a mouse did not reach target activity within the 28 s of the trial, it received a 3 s timeout consisting of a black screen.
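As a rough illustration of the calibration and mapping described above, the following Python sketch resamples trials from a baseline trace, raises a threshold in 0.005 ΔF/F₀ steps until no more than half of the resampled trials reach it, and bins a value into a displayed angle. The original pipeline was written in MATLAB and is not reproduced here. The function names, the synthetic Gaussian baseline, the success criterion (trial peak at or above threshold) and the reading of the bins as six equal steps below threshold plus the 90° target are all assumptions made for illustration.

```python
import random

def trial_maxima(signal, n_trials=200, trial_len=435, seed=0):
    """Peak of the subtracted (P+ minus P-) signal in randomly placed baseline 'trials'."""
    rng = random.Random(seed)
    last_start = len(signal) - trial_len
    starts = [rng.randint(0, last_start) for _ in range(n_trials)]
    return [max(signal[s:s + trial_len]) for s in starts]

def calibrate_threshold(signal, maxima, step=0.005, target_rate=0.5):
    """Raise the threshold in fixed steps until the fraction of trials whose
    peak reaches it is no greater than target_rate (assumed success criterion)."""
    threshold = min(signal)
    while sum(m >= threshold for m in maxima) / len(maxima) > target_rate:
        threshold += step
    return threshold

def angle_for(value, floor, threshold, n_angles=7, step_deg=15):
    """Map a signal value to one of n_angles displayed angles (0, 15, ..., 90).
    Assumed reading: equal-width steps between floor and threshold, 90 at threshold."""
    if value >= threshold:
        return (n_angles - 1) * step_deg
    if value <= floor:
        return 0
    fraction = (value - floor) / (threshold - floor)
    return int(fraction * (n_angles - 1)) * step_deg

# Synthetic stand-in for a 12,000-frame subtracted baseline signal (hypothetical data).
rng = random.Random(1)
baseline = [rng.gauss(0.0, 0.1) for _ in range(12000)]
maxima = trial_maxima(baseline)
floor = min(baseline)
threshold = calibrate_threshold(baseline, maxima)

# Displayed angle for the mean of the last 3 frames, as in the closed-loop update.
last_three = baseline[-3:]
displayed_angle = angle_for(sum(last_three) / 3, floor, threshold)
```

Precomputing the trial peaks once keeps the threshold search cheap, because each step then only counts how many of the 200 peaks still clear the candidate value.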
To avoid the problem of drifting baselines, ΔF/F₀ for each neuron was calculated as (F_i – F_i0)/F_i0, where F_i0 was the 10th percentile of fluorescence in the previous 30 s. For the optogenetics experiments, we recorded 2 different baselines (13 min each, during passive viewing of the Gabor patch, as described above). The first one, with the LED off (control), was used only for post hoc analysis of the data shown in Fig. 3. The second baseline was recorded during LED stimulation (opto on) and was used to map neuronal activity to angle and target during the closed-loop part of the BCI task. This mapping was crucial to ensure that the decoder was calibrated consistently for the closed-loop BCI task, which was recorded during the opto on condition only. Early and late training were defined as days 1–8 and 9–14, respectively, based on average performance (accuracy) remaining above 0.75 in the control (opto off) condition. For anaesthesia experiments, we recorded two passive viewing sessions (awake and anaesthetized, 24 min each) in which we presented the same set of stimuli used when recording the baseline session for the BCI task. We anaesthetized the mice in between these two sessions by initially administering (via inhalation) 4% isoflurane, which was subsequently decreased to 1% for the duration of the imaging session.
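The drift-corrected ΔF/F₀ defined at the start of this passage (baseline taken as the 10th percentile of the preceding 30 s) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' MATLAB code: the 450-frame window assumes 30 s at the 15 Hz somatic frame rate, the window is assumed to include the current frame and to be shorter at the start of the recording, and the percentile uses a nearest-rank rule rather than interpolation.

```python
def percentile_nearest_rank(values, q):
    """q-th percentile (integer q) using the nearest-rank rule."""
    ordered = sorted(values)
    rank = max(1, (q * len(ordered) + 99) // 100)  # integer ceil(q * n / 100)
    return ordered[rank - 1]

def rolling_dff(trace, window_frames=450, q=10):
    """(F_i - F_i0) / F_i0 with F_i0 the q-th percentile of a trailing window."""
    dff = []
    for i, f in enumerate(trace):
        window = trace[max(0, i - window_frames + 1): i + 1]
        f0 = percentile_nearest_rank(window, q)
        dff.append((f - f0) / f0)
    return dff

# Hypothetical raw trace: slow upward drift plus a single large transient.
trace = [100.0 + 0.01 * i for i in range(1000)]
trace[600] += 50.0
dff = rolling_dff(trace)
```

Because the baseline tracks the lower envelope of recent fluorescence, the slow drift contributes only a small offset while the transient at frame 600 stands out clearly.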

P+ and P− neuron selection

On day 1, we drew 20–40 regions of interest (ROIs) in a single field of view prior to starting our baseline recording. Next, we recorded a session of passive visual stimuli that we would later use as our baseline recording for day 1. At the end of this recording, the ΔF/F₀ traces for all drawn ROIs were plotted and visually inspected using a custom MATLAB script. The experimenter would then select either eight or ten of these traces based on event frequency, signal-to-noise ratio (SNR; determined as the ratio between noise band width and maximum event size), baseline stability and calcium transient dynamics (with a clear rise, peak and exponential decay, as opposed to plateau-looking events). The best 8–10 neurons would then be selected from the available pool of neurons on which ROIs were drawn. No arbitrary parameter cut-off (for example, minimum event frequency or SNR) was introduced. The subdivision of these eight to ten neurons into the P+ and P− populations would then be determined by a random number generator. Once selected, P+ and P− neurons would remain the same for the entire duration of the experiment.

Online motion-correction

To avoid drifts in x and y out of our selected regions of interest, we used a fast Fourier transform-based approach to motion-correct our movies online. To do so, at the beginning of each recording session we acquired a reference image by averaging 20–40 s (300–600 frames) of data collected from our field of view. To motion-correct each subsequent frame, we selected four smaller central areas and registered them independently of one another (2D rigid translation) against the corresponding four areas in the reference image^67. Finally, we rigidly translated the entire 2D image by the average translation in x and y across these four subregions.
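The four-subregion scheme above can be illustrated with a small Python sketch. To stay self-contained it swaps the FFT-based cross-correlation for a brute-force search over integer shifts that minimizes the squared difference, so it shows the structure of the procedure (register four central areas independently, then average) rather than the registration algorithm actually used. The region size, search range, function names and synthetic image are assumptions.

```python
import random

def estimate_shift(ref, frame, region, max_shift):
    """Return (dy, dx) minimizing the summed squared difference between
    ref[y][x] and frame[y + dy][x + dx] over region = (r0, r1, c0, c1)."""
    r0, r1, c0, c1 = region
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err = 0.0
            for y in range(r0, r1):
                for x in range(c0, c1):
                    d = ref[y][x] - frame[y + dy][x + dx]
                    err += d * d
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def four_region_shift(ref, frame, size=8, max_shift=3):
    """Register four central size x size areas independently and average the shifts."""
    cy, cx = len(ref) // 2, len(ref[0]) // 2
    regions = [
        (cy - size, cy, cx - size, cx), (cy - size, cy, cx, cx + size),
        (cy, cy + size, cx - size, cx), (cy, cy + size, cx, cx + size),
    ]
    shifts = [estimate_shift(ref, frame, reg, max_shift) for reg in regions]
    dy = sum(s[0] for s in shifts) / len(shifts)
    dx = sum(s[1] for s in shifts) / len(shifts)
    return dy, dx

def roll(image, dy, dx):
    """Circularly translate an image so that out[y + dy][x + dx] == image[y][x]."""
    h, w = len(image), len(image[0])
    return [[image[(y - dy) % h][(x - dx) % w] for x in range(w)] for y in range(h)]

# Synthetic reference image and a copy displaced by 2 rows and -1 column.
rng = random.Random(0)
reference = [[rng.random() for _ in range(32)] for _ in range(32)]
moved = roll(reference, 2, -1)

dy, dx = four_region_shift(reference, moved)
corrected = roll(moved, -round(dy), -round(dx))  # undo the estimated translation
```

Averaging the four independent estimates makes the final translation less sensitive to any single subregion that happens to contain little structure.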

Visual stimuli

Visual stimuli were generated using the Psychophysics Toolbox package for MATLAB (MathWorks)^68 and displayed on a monitor 20 cm away from the contralateral eye. Visual stimuli consisted of a rotating Gabor patch at 7 angles spaced 15° apart from 0° to 90°.

Offline image analysis and signal extraction

To correct for brain motion after image acquisition, as well as to automatically detect ROIs, we used the Suite2p pipeline^69. For each field of view, we removed duplicates by excluding ROIs whose signal correlation was above 0.6 and whose centres were within 20 μm of each other. To separate trunk signals from potential neuropil contamination, the fluorescence signals of our ROIs were processed using FISSA^70 with the following parameters: 4 neuropil subregions and alpha = 0.1. To estimate ΔF/F₀ after neuropil subtraction, we calculated ΔF/F₀ at time point i as (F_i – F₀)/F₀, where F₀ was defined as the tenth percentile of a 120-s-long sliding window, to remove fluorescence drifts over the course of imaging. Next, we performed spike inference using the CASCADE model Global_EXC_15Hz_smoothing200ms^45.
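The duplicate-removal rule above (signal correlation above 0.6 and centres within 20 μm) might be sketched in Python as follows. The text does not say which member of a duplicate pair was kept, so keeping the first ROI encountered is an assumption, as are the dictionary layout, the function names and the toy traces.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length traces."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def centre_distance(roi_a, roi_b):
    (xa, ya), (xb, yb) = roi_a["centre_um"], roi_b["centre_um"]
    return math.hypot(xa - xb, ya - yb)

def remove_duplicates(rois, max_corr=0.6, max_dist_um=20.0):
    """Drop an ROI when it is both close to and highly correlated with an ROI
    that was already kept (assumption: the first one encountered wins)."""
    kept = []
    for roi in rois:
        duplicate = any(
            centre_distance(roi, k) < max_dist_um
            and pearson(roi["trace"], k["trace"]) > max_corr
            for k in kept
        )
        if not duplicate:
            kept.append(roi)
    return kept

# Toy field of view (hypothetical values).
rois = [
    {"name": "A", "centre_um": (0.0, 0.0), "trace": [0, 1, 0, 2, 0, 1]},
    {"name": "B", "centre_um": (5.0, 5.0), "trace": [0.1, 2.1, 0.0, 3.9, 0.2, 2.0]},
    {"name": "C", "centre_um": (100.0, 100.0), "trace": [0, 1, 0, 2, 0, 1]},
    {"name": "D", "centre_um": (3.0, 4.0), "trace": [1, 0, 2, 0, 1, 0]},
]
kept_names = [roi["name"] for roi in remove_duplicates(rois)]
```

In this toy example B is dropped because it is both near A and strongly correlated with it, whereas C (correlated but distant) and D (near but anticorrelated) are retained.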

Field of view matching and ROI registration across days

Registration of neurons across days for BCI training was performed manually at the beginning of each session with the help of custom-designed software. On day 1, a mean intensity reference image of our field of view of interest was acquired. Using custom-designed software, we manually drew 10–20 reference ROIs, which could include any recognizable brain structure, such as dendrites, cell somas and sharp-contrast blood vessels. On the following days, after manually finding the same approximate area for the field of view imaged on day 1, a more accurate manual registration was performed by aligning the reference ROIs drawn on day 1 with their corresponding structures on following days. As the relative x and y distance between structures varies along the z-plane, our approach allowed us to consistently match our field of view on day 1 across x, y and z dimensions on any given day. Offline registration of ROIs across days, on the other hand, was initially performed using the ROIMatchPub implementation for Suite2p, followed by exhaustive manual curation.

Quantification of event frequency, magnitude and timing

Events were detected for each ROI using the MATLAB function findpeaks on the spike-inferred signal. For analysis of the spike-inferred signal, we estimated the integral of individual peaks by multiplying the height and width of individual transients. Event occurrence was defined as the time at which spike probability peaked. For ΔF/F₀ analysis, once we found an event, we used a 2 s backward sliding window to identify the frame at which the derivative of the ΔF/F₀ signal became positive and remained so for 300 ms. This was considered the transient onset frame, while the peak of the transient was taken as the maximum ΔF/F₀ value in the 2 s following peak detection. We then estimated the integral of the ΔF/F₀ signal by multiplying the height (maximum ΔF/F₀ value – ΔF/F₀ value at transient onset) and the width (frame at maximum ΔF/F₀ value – frame at transient onset) of the ΔF/F₀ signal. The backward and forward detection windows were limited in time by the presence of a preceding or subsequent event detected using the spike-inferred signal. Proximal trunks were paired to their corresponding distal trunk whenever their ΔF/F₀ signal correlation was equal to or above 0.6. For optogenetics and anaesthesia experiments, we matched proximal trunks to their corresponding distal trunk using activity during control (opto off) and wakefulness, respectively. Whenever we found more than one distal dendrite correlated with the same proximal trunk, we selected the one with the best signal-to-noise ratio, so as to always have a single distal dendrite associated with a proximal trunk. Coincident events were defined as two events (independently detected) occurring within a 500 ms window in the two compartments. To quantify the somato-dendritic magnitude mismatches of coincident events, we first fit a best-fit line to the somatic and dendritic magnitudes of all events.
For each event, we calculated the residual from the best-fit line, and defined residuals larger than 0 as dendritically amplified and residuals smaller than 0 as dendritically attenuated. To estimate the somato-dendritic (SD) residual during optogenetic stimulation and anaesthesia, we first estimated the best-fit line using somato-dendritic activity during light off and wakefulness conditions, respectively. Next, we calculated the residual for all events detected during opto on and anaesthesia as the distance between these events and the previously calculated best-fit line.
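The pairing of coincident events and the residual-based classification described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code. It assumes that dendritic magnitude is regressed on somatic magnitude (so that a positive residual means a larger-than-predicted dendritic event), that the residual is the vertical distance to the line, and that each somatic event is paired greedily with the nearest unused dendritic event; the event lists and function names are hypothetical.

```python
def pair_coincident(soma_events, dend_events, window_s=0.5):
    """Pair (time_s, magnitude) events from the two compartments that fall
    within window_s of each other; returns (somatic, dendritic) magnitudes."""
    used, pairs = set(), []
    for t_soma, mag_soma in soma_events:
        candidates = [
            (abs(t_dend - t_soma), j)
            for j, (t_dend, _) in enumerate(dend_events)
            if j not in used and abs(t_dend - t_soma) <= window_s
        ]
        if candidates:
            _, j = min(candidates)
            used.add(j)
            pairs.append((mag_soma, dend_events[j][1]))
    return pairs

def fit_line(pairs):
    """Ordinary least-squares line: dendritic = slope * somatic + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals(pairs, slope, intercept):
    return [y - (slope * x + intercept) for x, y in pairs]

def label(residual):
    if residual > 0:
        return "amplified"
    if residual < 0:
        return "attenuated"
    return "on the line"

# Hypothetical control-condition events: (time in s, magnitude).
soma = [(1.0, 1.0), (5.0, 2.0), (9.0, 3.0), (20.0, 1.0)]
dend = [(1.2, 1.0), (5.1, 2.5), (8.8, 3.0), (30.0, 2.0)]

control_pairs = pair_coincident(soma, dend)
slope, intercept = fit_line(control_pairs)
control_labels = [label(r) for r in residuals(control_pairs, slope, intercept)]

# Perturbed-condition events are scored against the control line, not refit.
perturbed_pairs = [(2.0, 1.0)]
perturbed_residuals = residuals(perturbed_pairs, slope, intercept)
```

Reusing the control fit for the perturbed condition mirrors the procedure in the text, in which opto on and anaesthesia events are measured against the line estimated during light off and wakefulness.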

Decoders

To decode whether individual transients would be amplified or attenuated, we trained an SVM binary classifier (linear kernel) using stochastic gradient descent^71 (as implemented by MATLAB fitclinear). For each coincident event in the soma and dendrites, we averaged the spike-inferred activity of each neuron in our field of view (excluding the neuron of interest) in the preceding 2 s, and we used this average activity to create an n-dimensional population activity vector, where n corresponds to the number of isolated units in our field of view. The binary classifier was trained to separate dendritic amplification from dendritic attenuation (see above) using a leave-one-out approach. Accuracy was determined as the fraction of correctly classified events. For imbalanced datasets, we used a synthetic minority oversampling technique (SMOTE, k neighbours = 5) to train (not test) using a balanced dataset. SMOTE was applied after separating our train and test datasets. To control for any potential data leakage, our shuffle control went through the exact same procedure as our test dataset, including SMOTE oversampling, with the only difference being that labels were randomly shuffled before separating the train and test data. We calculated the confidence of a prediction as the Euclidean distance from the hyperplane. Reward-associated and reward-instructive epochs were defined as 2 s before and 2 s after target activity was reached, respectively, for successful trials, and 2 s before and 2 s after the end of a trial for unsuccessful trials. To decode successful from unsuccessful trials, we generated an n-dimensional SD residual vector by taking the residual for each neuron for which we identified a somato-dendritic pair (see above) in these 2 s epochs. Neurons inactive in the 2 s epochs were assigned a value of 0. The binary classifier was trained in the same manner as described above.
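The ordering that matters for avoiding leakage (split first, then oversample only the training fold) can be illustrated with a short Python sketch. It implements a basic SMOTE interpolation and leave-one-out loop in pure Python, and it uses a nearest-centroid rule as a stand-in classifier, not the linear SVM trained with fitclinear in the study. The function names and the toy two-dimensional data are assumptions.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic points, each on the segment between a minority
    sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        if not others:  # a single sample cannot be interpolated
            synthetic.append(list(base))
            continue
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def balance(X, y, k, rng):
    """Oversample the smaller class of a binary training set with SMOTE."""
    groups = {c: [x for x, lab in zip(X, y) if lab == c] for c in (0, 1)}
    small = min(groups, key=lambda c: len(groups[c]))
    if not groups[small]:
        return X, y
    n_new = len(groups[1 - small]) - len(groups[small])
    extra = smote(groups[small], n_new, k, rng)
    return X + extra, y + [small] * len(extra)

def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]

def predict_nearest_centroid(train_X, train_y, x):
    """Stand-in classifier (assumption): assign x to the closer class centroid."""
    c0 = centroid([p for p, lab in zip(train_X, train_y) if lab == 0])
    c1 = centroid([p for p, lab in zip(train_X, train_y) if lab == 1])
    return int(sq_dist(x, c1) < sq_dist(x, c0))

def leave_one_out_accuracy(X, y, k=5, seed=0):
    rng = random.Random(seed)
    correct = 0
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        train_X, train_y = balance(train_X, train_y, k, rng)  # training fold only
        correct += predict_nearest_centroid(train_X, train_y, X[i]) == y[i]
    return correct / len(X)

# Toy imbalanced data: 12 "attenuated" (0) and 4 "amplified" (1) events.
X = [[0.1 * i, 0.1 * (i % 3)] for i in range(12)] + [[3 + 0.1 * i, 3 + 0.1 * i] for i in range(4)]
y = [0] * 12 + [1] * 4
accuracy = leave_one_out_accuracy(X, y)
```

A shuffle control in the spirit of the text would permute y before calling leave_one_out_accuracy, so that the shuffled data pass through exactly the same split-then-oversample procedure.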

Statistics

All analyses were performed in MATLAB 2020a using custom-written scripts and functions. All error bars in figures represent s.e.m. Statistical tests and independent samples are described in figure legends. All t-tests in the Article are two-sided.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Data are available upon request. Source data are provided with this paper.

Code availability

The BCI code is available at https://github.com/harnett/CLOOSE. Analysis code is available upon request.

References

Sacramento, J., Bengio, Y., Costa, R. P. & Senn, W. Dendritic cortical microcircuits approximate the backpropagation algorithm. In Proc. 32nd International Conference on Neural Information Processing Systems (eds Bengio, S. & Wallach, H. M.) 8735–8746 (ACM, 2018).
Payeur, A., Guerguiev, J., Zenke, F., Richards, B. A. & Naud, R. Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits. Nat. Neurosci. 24, 1010–1019 (2021).
Guerguiev, J., Lillicrap, T. P. & Richards, B. A. Towards deep learning with segregated dendrites. eLife 6, e22901 (2017).
Greedy, W., Zhu, H. W., Pemberton, J., Mellor, J. & Costa, R. P. Single-phase deep learning in cortico-cortical networks. In Proc. 36th International Conference on Neural Information Processing Systems (eds Koyejo, S. et al.) 24213–24225 (ACM, 2022).
Körding, K. P. & König, P. Supervised and unsupervised learning with two sites of synaptic integration. J. Comput. Neurosci. 11, 207–215 (2001).
Magee, J. C. & Grienberger, C. Synaptic plasticity forms and functions. Annu. Rev. Neurosci. 43, 95–117 (2020).
Bittner, K. C., Milstein, A. D., Grienberger, C., Romani, S. & Magee, J. C. Behavioral time scale synaptic plasticity underlies CA1 place fields. Science 357, 1033–1036 (2017).
Caporale, N. & Dan, Y. Spike timing–dependent plasticity: a Hebbian learning rule. Annu. Rev. Neurosci. 31, 25–46 (2008).
Artola, A., Bröcher, S. & Singer, W. Different voltage-dependent thresholds for inducing long-term depression and long-term potentiation in slices of rat visual cortex. Nature 347, 69–72 (1990).
Kirkwood, A., Rioult, M. G. & Bear, M. F. Experience-dependent modification of synaptic plasticity in visual cortex. Nature 381, 526–528 (1996).
Bliss, T. V. P. & Lømo, T. Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path. J. Physiol. 232, 331–356 (1973).
Bienenstock, E. L., Cooper, L. N. & Munro, P. W. Theory for the development of neuron selectivity: orientation specificity and binocular interaction in visual cortex. J. Neurosci. 2, 32–48 (1982).
Abbott, L. F. & Blum, K. I. Functional significance of long-term potentiation for sequence learning and prediction. Cereb. Cortex 6, 406–416 (1996).
Lillicrap, T. P., Santoro, A., Marris, L., Akerman, C. J. & Hinton, G. Backpropagation and the brain. Nat. Rev. Neurosci. 21, 335–346 (2020).
Richards, B. A. & Lillicrap, T. P. Dendritic solutions to the credit assignment problem. Curr. Opin. Neurobiol. 54, 28–36 (2019).
Richards, B. A. et al. A deep learning framework for neuroscience. Nat. Neurosci. 22, 1761–1770 (2019).
Minsky, M. Steps toward artificial intelligence. Proc. IRE 49, 8–30 (1961).
Lansdell, B. J., Prakash, P. R. & Kording, K. P. Learning to solve the credit assignment problem. In Proceedings of the International Conference on Learning Representations (ICLR) https://openreview.net/forum?id=ByeUBANtvB (2020).
Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).
Meulemans, A., Carzaniga, F., Suykens, J., Sacramento, J. & Grewe, B. F. A theoretical framework for target propagation. In Proc. 34th International Conference on Neural Information Processing Systems vol. 33 (eds Larochelle, H. et al.) 20024–20036 (ACM, 2020).
Lee, D.-H., Zhang, S., Fischer, A. & Bengio, Y. Difference target propagation. In Machine Learning and Knowledge Discovery in Databases 9284 (eds Appice, A. et al.) 498–515 (Springer, 2015).
Makino, H. & Komiyama, T. Learning enhances the relative impact of top-down processing in the visual cortex. Nat. Neurosci. 18, 1116–1122 (2015).
Larkum, M. E., Zhu, J. J. & Sakmann, B. A new cellular mechanism for coupling inputs arriving at different cortical layers. Nature 398, 338–341 (1999).
Larkum, M. A cellular mechanism for cortical associations: an organizing principle for the cerebral cortex. Trends Neurosci. 36, 141–151 (2013).
Lafourcade, M. et al. Differential dendritic integration of long-range inputs in association cortex via subcellular changes in synaptic AMPA-to-NMDA receptor ratio. Neuron 110, 1532–1546 (2022).
Marques, T., Nguyen, J., Fioreze, G. & Petreanu, L. The functional organization of cortical feedback inputs to primary visual cortex. Nat. Neurosci. 21, 757–764 (2018).
Manita, S. et al. A top-down cortical circuit for accurate sensory perception. Neuron 86, 1304–1316 (2015).
Rockland, K. S. & Pandya, D. N. Laminar origins and terminations of cortical connections of the occipital lobe in the rhesus monkey. Brain Res. 179, 3–20 (1979).
Coogan, T. A. & Burkhalter, A. Conserved patterns of cortico-cortical connections define areal hierarchy in rat visual cortex. Exp. Brain Res. 80, 49–53 (1990).
Cauller, L. J., Clancy, B. & Connors, B. W. Backward cortical projections to primary somatosensory cortex in rats extend long horizontal axons in layer I. J. Comp. Neurol. 390, 297–310 (1998).
Fişek, M. et al. Cortico-cortical feedback engages active dendrites in visual cortex. Nature 617, 769–776 (2023).
Naud, R. & Sprekeler, H. Sparse bursts optimize information transmission in a multiplexed neural code. Proc. Natl Acad. Sci. USA 115, E6329–E6338 (2018).
Neely, R. M., Koralek, A. C., Athalye, V. R., Costa, R. M. & Carmena, J. M. Volitional modulation of primary visual cortex activity requires the basal ganglia. Neuron 97, 1356–1368.e4 (2018).
Clancy, K. B., Koralek, A. C., Costa, R. M., Feldman, D. E. & Carmena, J. M. Volitional modulation of optically recorded calcium signals during neuroprosthetic learning. Nat. Neurosci. 17, 807–809 (2014).
Clancy, K. B. & Mrsic-Flogel, T. D. The sensory representation of causally controlled objects. Neuron 109, 677–689.e4 (2021).
Jeon, B. B., Fuchs, T., Chase, S. M. & Kuhlman, S. J. Existing function in primary visual cortex is not perturbed by new skill acquisition of a non-matched sensory task. Nat. Commun. 13, 3638 (2022).
Lai, C., Tanaka, S., Harris, T. D. & Lee, A. K. Volitional activation of remote place representations with a hippocampal brain–machine interface. Science 382, 566–573 (2023).
Mitani, A., Dong, M. & Komiyama, T. Brain–computer interface with inhibitory neurons reveals subtype-specific strategies. Curr. Biol. 28, 77–83.e4 (2018).
Koralek, A. C., Jin, X., Long II, J. D., Costa, R. M. & Carmena, J. M. Corticostriatal plasticity is necessary for learning intentional neuroprosthetic skills. Nature 483, 331–335 (2012).
Voigts, J. & Harnett, M. T. Somatic and dendritic encoding of spatial variables in retrosplenial cortex differs during 2D navigation. Neuron 105, 237–245.e4 (2019).
Beaulieu-Laroche, L., Toloza, E. H. S., Brown, N. J. & Harnett, M. T. Widespread and highly correlated somato-dendritic activity in cortical layer 5 neurons. Neuron 103, 235–241.e4 (2019).
Peters, A. J., Lee, J., Hedrick, N. G., O’Neil, K. & Komiyama, T. Reorganization of corticospinal output during motor learning. Nat. Neurosci. 20, 1133–1141 (2017).
Francioni, V., Padamsey, Z. & Rochefort, N. L. High and asymmetric somato-dendritic coupling of V1 layer 5 neurons independent of visual stimulation and locomotion. eLife 8, e49145 (2019).
Padamsey, Z., Katsanevaki, D., Dupuy, N. & Rochefort, N. L. Neocortex saves energy by reducing coding precision during food scarcity. Neuron 110, 280–296.e10 (2022).
Rupprecht, P. et al. A database and deep learning toolbox for noise-optimized, generalized spike inference from calcium imaging. Nat. Neurosci. 24, 1324–1337 (2021).
Kerlin, A. et al. Functional clustering of dendritic activity during decision-making. eLife 8, e46966 (2019).
Hill, D. N., Varga, Z., Jia, H., Sakmann, B. & Konnerth, A. Multibranch activity in basal and tuft dendrites during firing of layer 5 cortical neurons in vivo. Proc. Natl Acad. Sci. USA 110, 13618–13623 (2013).
Grienberger, C., Chen, X. & Konnerth, A. NMDA receptor-dependent multidendrite Ca²⁺ spikes required for hippocampal burst firing in vivo. Neuron 81, 1274–1281 (2014).
Otor, Y. et al. Dynamic compartmental computations in tuft dendrites of layer 5 neurons during motor behavior. Science 376, 267–275 (2022).
Helmchen, F., Svoboda, K., Denk, W. & Tank, D. W. In vivo dendritic calcium dynamics in deep-layer cortical pyramidal neurons. Nat. Neurosci. 2, 989–996 (1999).
Francioni, V. & Harnett, M. T. Rethinking single neuron electrical compartmentalization: dendritic contributions to network computation in vivo. Neuroscience 489, 185–199 (2022).
Xu, N. et al. Nonlinear dendritic integration of sensory and motor input during an active sensing task. Nature 492, 247–251 (2012).
Abs, E. et al. Learning-related plasticity in dendrite-targeting layer 1 interneurons. Neuron 100, 684–699.e6 (2018).
Palmer, L. M. et al. The cellular basis of GABA_B-mediated interhemispheric inhibition. Science 335, 989–993 (2012).
Keller, A. J., Roth, M. M. & Scanziani, M. Feedback generates a second receptive field in neurons of the visual cortex. Nature 582, 545–549 (2020).
Jiang, X., Wang, G., Lee, A. J., Stornetta, R. L. & Zhu, J. J. The organization of two new cortical interneuronal circuits. Nat. Neurosci. 16, 210–218 (2013).
Reinhold, K. et al. Striatum supports fast learning but not memory recall. Nature 643, 458–467 (2025).
Meulemans, A. et al. Credit assignment in neural networks through deep feedback control. In Advances in Neural Information Processing Systems 34 (eds Ranzato, M. et al.) 4674–4687 (Curran Associates, 2021).
Todorov, E. & Jordan, M. I. Optimal feedback control as a theory of motor coordination. Nat. Neurosci. 5, 1226–1235 (2002).
Athalye, V. R., Carmena, J. M. & Costa, R. M. Neural reinforcement: re-entering and refining neural dynamics leading to desirable outcomes. Curr. Opin. Neurobiol. 60, 145–154 (2020).
Gershman, S. J. et al. Explaining dopamine through prediction errors and beyond. Nat. Neurosci. 27, 1645–1655 (2024).
Kasahara, K., DaSalla, C. S., Honda, M. & Hanakawa, T. Basal ganglia–cortical connectivity underlies self-regulation of brain oscillations in humans. Commun. Biol. 5, 712 (2022).
Mikulasch, F. A., Rudelt, L., Wibral, M. & Priesemann, V. Where is the error? Hierarchical predictive coding through dendritic error computation. Trends Neurosci. 46, 45–59 (2023).
Branco, T. & Häusser, M. The single dendritic branch as a fundamental functional unit in the nervous system. Curr. Opin. Neurobiol. 20, 494–502 (2010).
Marques, T. et al. A role for mouse primary visual cortex in motion perception. Curr. Biol. 28, 1703–1713.e6 (2018).
You 游文愷, W.-K. & Mysore, S. P. Dynamics of visual perceptual decision-making in freely behaving mice. eNeuro 9, 0161-21.2022 (2022).
Guizar-Sicairos, M., Thurman, S. T. & Fienup, J. R. Efficient subpixel image registration algorithms. Opt. Lett. 33, 156–158 (2008).
Brainard, D. H. The Psychophysics Toolbox. Spat. Vis. 10, 433–436 (1997).
Pachitariu, M. et al. Suite2p: beyond 10,000 neurons with standard two-photon microscopy. Preprint at bioRxiv https://doi.org/10.1101/061507 (2017).
Keemink, S. W. FISSA: a neuropil decontamination toolbox for calcium imaging signals. Sci. Rep. 8, 3493 (2018).
Hsieh, C.-J., Chang, K.-W., Lin, C.-J., Keerthi, S. S. & Sundararajan, S. A dual coordinate descent method for large-scale linear SVM. In Proc. 25th International Conference on Machine Learning 408–415 (ACM, 2008).

Acknowledgements

We thank I. Fiete, R. Naud and C. Yaeger for comments on the manuscript. V.F. was supported by the Y. Eva Tan Postdoctoral Fellowship. V.D.T. was supported by the MathWorks Science Fellowship and the Janet and Sheldon Razin Fellowship. E.H.S.T. was supported by the National Institute of General Medical Sciences (T32GM007753), the Paul and Daisy Soros Fellowship, and the Yang ICoN Graduate Fellowship. M.T.H. was supported by the NIH (R01NS106031, R01NS113079 and R01MH135141) and by the Klingenstein–Simons Fellowship, the Vallee Foundation Scholars and the McKnight Scholars programmes.

Author information

These authors contributed equally: Vincent D. Tang, Enrique H. S. Toloza, Zilan Ding

Authors and Affiliations

McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
Valerio Francioni, Vincent D. Tang, Enrique H. S. Toloza, Zilan Ding, Norma J. Brown & Mark T. Harnett
Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
Valerio Francioni, Vincent D. Tang, Zilan Ding, Norma J. Brown & Mark T. Harnett
Department of Physics, MIT, Cambridge, MA, USA
Enrique H. S. Toloza
Harvard Medical School, Boston, MA, USA
Enrique H. S. Toloza

Authors

Valerio Francioni
Vincent D. Tang
Enrique H. S. Toloza
Zilan Ding
Norma J. Brown
Mark T. Harnett

Contributions

V.F. conceptualized and designed the experimental approach, designed the BCI, habituated and performed surgery on mice, collected the data, conceptualized and implemented the analyses, prepared the figures and wrote the manuscript. V.D.T. helped in the conceptualization of data analysis, building the data analysis pipeline and writing the manuscript. E.H.S.T. helped to write the manuscript. Z.D. helped in performing surgeries on mice and with data acquisition. N.J.B. performed surgeries on mice. M.T.H. supervised all aspects of the project.

Corresponding author

Correspondence to Mark T. Harnett.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks the anonymous reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Examples of a field of view with P+ and P- neurons.

Examples of a field of view with the same, GCaMP7f-labelled, chronically-tracked P+ and P- neurons (5 each, shown in red and blue, respectively), imaged at the proximal trunk (as a proxy for soma) during 14 days of learning, shown at day 1, 5, 9 and 14. P+ and P- neurons were selected on day 1. Scale bar represents 50 µm.

Extended Data Fig. 2 Task performance for individual animals.

Task performance evaluated using two metrics: accuracy (the fraction of successful trials) and rewards per minute for each of 6 mice. Summary graph and statistics in Fig. 1g. The grey dashed line represents chance accuracy.

Extended Data Fig. 3 Visual stimulus and behavioral correlates of learning.

a, Mean visual stimulus angle across 14 days of learning. Due to binning, stimuli below the 60° threshold were presented as 45°. b, Rotation speed across the 14 days of learning. Rotations towards and away from a reward were defined as positive and negative rotations, respectively. c, Mean frequency at which a new orientation was presented. d, Mean duration of successful trials across the 14 days of learning. e, Pearson’s correlation between licking frequency and stimulus angle (outside reward periods) during active (closed-loop) and passive stimulus presentation at the beginning and the end of learning (Two-way repeated measures ANOVA, p = 0.13, 1.4 × 10⁻¹¹ and 0.028 for the effect of state (active vs passive), days, and interaction between state and days, respectively. After two-stage Benjamini, Krieger and Yekutieli correction for false discovery rate, p = 8.5 × 10⁻⁴ for early-active vs late-active, 3.4 × 10⁻⁴ for early-active vs early-active (shuffle), 8.4 × 10⁻¹¹ for late-active vs late-active (shuffle), 1.7 × 10⁻⁴ for early-active vs early-passive, 5 × 10⁻¹¹ for late-active vs late-passive, 0.83 for early-passive vs late-passive, 0.8 for early-passive vs early-passive (shuffle), 0.81 for late-passive vs late-passive (shuffle). N = 15 for all conditions). Error bars and shaded areas are s.e.m. f, In black, the mean and standard error of the mean for the distance run by all animals in individual imaging sessions. All sessions were of equal length (41.26 min or 76,500 imaging frames (2 planes at 30 Hz)). In grey, the distance run by individual mice (One-way repeated measures ANOVA, p = 0.58, n = 6 across 14 days).
g, After excluding all timeouts, grey, black and reward-related periods, we divided our trials into 2-s bins and estimated whether the net rotation of the stimulus (the mean derivative of the angle) was positive (left panel, towards the reward, error reduction) or negative (right panel, away from the reward, error increase) in each bin, and calculated the average distance run during error reduction and error increase epochs. We found no differences either within conditions across days or among conditions (Two-way repeated measures ANOVA, p = 0.15, 0.86 and 0.14 for the effects of error type, days and the interaction between error type and days, respectively; n = 6 animals across 14 days). Error bars represent s.e.m.

Source data

Extended Data Fig. 4 Changes in GCaMP transient frequency over days depend on day 1 frequency.

a, Event frequency for P+, P- and P₀ neurons on day 1. P+ and P- neurons were selected to be of higher activity on day 1 compared to other neurons in the field of view (One-way ANOVA, p = 2.52e⁻³⁰. After Tukey’s multiple comparisons correction, p = 0.35, 9.57e⁻¹⁰ and 9.56e⁻¹⁰ for P+ vs P-, P+ vs P₀ and P- vs P₀, respectively. Mean = 0.072, 0.063 and 0.028; s.e.m. = 0.006, 0.005 and 5.6e⁻⁴; n = 27, 27 and 1839 for P+, P- and P₀, respectively). b, Distribution of event frequency for P₀ neurons on day 1. The mean event frequency for P+ and P- neurons (purple, dashed line) falls on the 8^th percentile of the P₀ neuron distribution. c, Calcium transient frequency for P₀ neurons divided into tertiles based on activity on day 1, across the 14 days of training, normalized to the activity on day 1. All neurons were tracked over the full 14 days of imaging (n = 6 mice). Shading represents s.e.m. d, Calcium transient frequency for P+, P- and P₀ neurons. Only P₀ neurons with the same calcium transient frequency as P+ and P- neurons on day 1 (neurons within the 95% confidence interval of the joint P+ and P- distribution) were selected. Neurons were tracked across the 14 days of imaging and activity was normalized to the activity on day 1 (Two-way repeated measures ANOVA, p = 0.008, p = 0.006 and p = 0.003 for the effects of population identity, days and the interaction between the two, respectively. n = 6 mice across 14 days. Mean values and s.e.m. are shown). e, Calcium transient frequency for P+, P- and P₀ on days 10 to 14, normalized to day 1 (One-way repeated measures ANOVA, p = 1e⁻⁷. After Tukey’s multiple comparison, p = 3.3e⁻⁶, 5.47e⁻⁴ and 0.02 for P+ vs P-, P+ vs P₀ and P- vs P₀, respectively; mean = 0.85, 0.60 and 0.71, s.e.m. = 0.037, 0.037 and 0.02 for P+, P- and P₀, respectively. n = 6 mice).

Source data

Extended Data Fig. 5 Testing the assumption of linearity between somatic and dendritic event magnitude.

a, Representative example of a single neuron whose events were isolated and their magnitude estimated in the soma and dendrites, plotted against each other. Axes are normalized to the largest event. To describe the relationship between somatic and dendritic event magnitude, we fit a linear model. b, Same as in a, except that we fit an exponential model to the data. c, Same as in a and b, except that we fit a logarithmic model to the data. d, For all neurons, we calculated the Akaike information criterion (AIC) to describe the goodness of fit of each model (while penalizing for the number of parameters used). Lower AIC values indicate a better fit. AIC values were zeroed relative to the linear fit model. Error bars represent standard error of the mean (One-way repeated measures ANOVA; p < 1e⁻¹⁵, p < 1e⁻¹⁵, p = 0.56 for Linear vs Exponential, Linear vs Logarithmic and Exponential vs Logarithmic after Tukey’s multiple comparison correction. Mean = 0, 13.8 and 12.1; s.e.m. = 0, 1.11 and 1.17; n = 543 neurons for Linear, Exponential and Logarithmic fit, respectively).
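The model comparison in panel d can be sketched for a single neuron in Python. This is only an illustration, not the authors' fitting procedure. It assumes the least-squares form of the AIC, n·ln(RSS/n) + 2k, with k = 2 parameters per model. It also assumes that the exponential and logarithmic models are fitted by linearization, which requires strictly positive magnitudes, and that at least one event deviates from each fitted curve so that RSS is non-zero. All function names are hypothetical.

```python
import math

def ols(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope

def aic(observed, predicted, k):
    """Least-squares AIC: n * ln(RSS / n) + 2k (lower is better)."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(rss / n) + 2 * k

def compare_models(soma, dend):
    """Return AIC of each model relative to the linear fit."""
    a, b = ols(soma, dend)                              # dend = a + b*soma
    linear = [a + b * s for s in soma]
    a, b = ols(soma, [math.log(d) for d in dend])       # ln(dend) = a + b*soma
    exponential = [math.exp(a + b * s) for s in soma]
    a, b = ols([math.log(s) for s in soma], dend)       # dend = a + b*ln(soma)
    logarithmic = [a + b * math.log(s) for s in soma]
    scores = {
        "linear": aic(dend, linear, 2),
        "exponential": aic(dend, exponential, 2),
        "logarithmic": aic(dend, logarithmic, 2),
    }
    baseline = scores["linear"]
    # Zero the values relative to the linear model, as in panel d.
    return {name: value - baseline for name, value in scores.items()}
```

Residuals are always evaluated in the original (untransformed) dendritic magnitude so that the three AIC values are comparable. Positive values for the exponential and logarithmic entries indicate that the linear model fits better.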

Source data

Extended Data Fig. 6 SD residual analysis using ΔF/F₀-based event size estimation.

a, Same as in Fig. 2i. Our ΔF/F₀-based approach to estimate event size (see Methods and Supplementary Fig. 2) decorrelated the SD residual from somatic magnitude. b, Same as Supplementary Fig. 1. Linear models best describe the relationship between somatic and dendritic event magnitudes. For all neurons, we calculated the Akaike information criterion (AIC). Lower AIC values indicate a better fit. AIC values were zeroed relative to the linear fit model. Error bars represent standard error of the mean (One-way repeated measures ANOVA; p = 3e⁻¹⁰, p = 1e⁻¹⁰, p = 1e⁻¹⁰ for Linear vs Exponential, Linear vs Logarithmic and Exponential vs Logarithmic after Tukey’s multiple comparison correction. Mean = 0, 12.4 and 20.1; s.e.m. = 0, 0.9 and 1.4; n = 543 neurons for Linear, Exponential and Logarithmic fit, respectively). c, Same as Fig. 2g. Pearson correlation between decoder performance (linear SVM) and the correlation between SD residual and the distance from the hyperplane (or SVM classification confidence). Each dot represents one neuron (Pearson correlation = 0.73. p = 2.5e⁻⁷⁸). d, Same as Fig. 2h, the proportion of neurons with a statistically-significant (alpha = 0.05) correlation between the SD residual and the distance from the hyperplane (or classification confidence; Wilcoxon signed rank test, p = 7.3e⁻⁴. N = 466 neurons). e, Same as Fig. 2j. Decoder performance (linear SVM) for statistically-significant neurons in d (paired t-test, p = 8.3e⁻¹⁰. Mean = 0.61 and 0.51; s.e.m. = 0.008 and 0.01; n = 66). f, Same as Fig. 2k. Pearson correlation between the SD residual and the distance from hyperplane for statistically-significant neurons (paired t-test, p = 1.6e⁻¹⁵. Mean = 0.27 and −9.6e⁻⁴; s.e.m. = 0.01 and 0.02; n = 66). g, Same as Fig. 2m. Pearson correlation between relative event timing in soma and dendrites and SD residual (residual from best fit; a negative correlation means that the larger the residual (more dendritically-amplified), the earlier the peak timing in the dendrite compared to the soma; paired t-test, p = 9.7e⁻⁴⁰. Mean = 0.13 and 0.005; s.e.m. = 0.007 and 0.006; n = 466). h, i, Same as Fig. 4e (left panel, paired t-test, p = 4.3e⁻⁹. Mean = 0.61 and 0.52; s.e.m. = 0.01 and 0.01, respectively for test and shuffle data. N = 83 sessions) and Fig. 4g (right panel, paired t-test, p = 2.4e⁻⁴. Mean = 0.55 and 0.50; s.e.m. = 0.01 and 0.01, respectively for test and shuffle data. N = 83 sessions), respectively, using a ΔF/F₀-based metric for estimating the size of the SD residual. j, Same as Fig. 4c. During error reduction epochs, dendrites of P+ neurons are relatively amplified compared to the dendrites of P- neurons (t-test, p = 1.4e⁻⁵; Mean = 0.01 and −0.11; s.e.m. = 0.02 and 0.02; n = 292 and 240 for P+ and P- neurons, respectively). k, Same as Fig. 4d. During error increase epochs, dendrites of P+ neurons are relatively attenuated compared to the dendrites of P- neurons (t-test, p = 3.1e⁻⁶; Mean = −0.07 and 0.04; s.e.m. = 0.02 and 0.01; n = 267 and 249 for P+ and P- neurons, respectively).

Source data

Extended Data Fig. 7 Decoding SD residuals using network activity in a single frame preceding event onset.

a, Decoder performance as a function of the correlation between SD residuals and hyperplane distance (Pearson’s r = 0.77; p = 7.1e⁻⁹⁴, n = 466 neurons). Data points represent individual neurons. b, Distribution of p-values for test data and a randomly shuffled control distribution, testing the correlation between SD residuals and distance from the hyperplane (or classification confidence; Wilcoxon signed rank test, p = 2.8e⁻⁹. N = 466 neurons). c, Decoding performance for neurons with a statistically-significant correlation between SD residual and distance from the hyperplane (paired t-test, p = 8.6e⁻²⁸. Mean = 0.61 and 0.49; s.e.m. = 0.006 and 0.007; n = 99). Dashed grey line indicates chance level. d, Pearson’s r for neurons with a statistically-significant correlation between SD residual and the distance between population vector and hyperplane (paired t-test, p = 7.9e⁻³⁰. Mean = 0.27 and −0.005; s.e.m. = 0.01 and 0.01; n = 99).

Source data

Extended Data Fig. 8 Differences in somatic and dendritic magnitudes for coincident events are predicted by local network dynamics in P₀.

a, Decoder performance as a function of the correlation between SD residuals and hyperplane distance (Pearson’s r = 0.75; p < 2.2e⁻³⁰⁸, n = 7381 neurons). Data points represent individual neurons. b, Distribution of p-values for test data and a randomly shuffled control distribution, testing the correlation between SD residuals and distance from the hyperplane (or classification confidence; Wilcoxon signed rank test, p = 5e⁻⁹⁶. N = 7381 neurons). c, Pearson’s r for neurons with a statistically-significant correlation between SD residual and the distance between population vector and hyperplane (paired t-test, p < 2.2e⁻³⁰⁸. Mean = 0.31 and −0.01; s.e.m. = 0.003 and 0.005). d, For all neurons, the Pearson’s r for somato-dendritic residuals and somatic event magnitude. e, Pearson correlation between the SD residual and the event latency between the soma and its corresponding dendrite, indicating that the larger the SD residual, the earlier the dendritic peak is compared to the somatic one (paired t-test, p = 1.48e⁻¹⁶⁹. Mean = −0.077 and −6.9e⁻⁴; s.e.m. = 0.002 and 0.001; n = 7381 neurons). f, Decoding performance for neurons with a statistically-significant correlation between SD residual and distance from the hyperplane (paired t-test, p = 1.1e⁻²⁵⁴. Mean = 0.63 and 0.50; s.e.m. = 0.002 and 0.002). Dashed grey line indicates chance level.

Source data

Extended Data Fig. 9 Orientation preferences of decoded neurons.

a, Orientation preference index, defined as (R₉₀ – R₀)/(R₉₀ + R₀), where R₉₀ and R₀ are the inferred spike rates at 90- and 0-degree angles, respectively, during passive visual stimulus presentation for P+, P- and P₀ (One-way ANOVA, p = 0.09; after Tukey’s multiple comparisons, p = 0.25, 0.98 and 0.07 for P+ vs P-, P+ vs P₀ and P- vs P₀, respectively. Mean = −0.005, 0.02 and −0.007; s.e.m. = 0.01, 0.01 and 0.002; n = 268, 198 and 7381 for P+, P- and P₀, respectively). b, Orientation preference index for all neurons with a statistically-significant Pearson’s correlation between the SD residual and the network distance from the maximally-separating hyperplane (t-test, p = 0.31. Mean = −0.002 and −0.007; s.e.m. = 0.004 and 0.002; n = 1298 and 6547 for significant and non-significant neurons, respectively).
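As a worked illustration of the index defined in panel a, the following Python sketch computes (R₉₀ – R₀)/(R₉₀ + R₀) for one neuron. The function name and the choice to return `None` when both rates are zero are assumptions; the legend does not state how silent neurons were treated.

```python
def orientation_preference_index(r90, r0):
    """(R90 - R0) / (R90 + R0) from inferred spike rates at 90 and 0 degrees.

    Returns None when both rates are zero, where the index is undefined
    (an assumption of this sketch, not a rule taken from the paper).
    """
    total = r90 + r0
    if total == 0:
        return None
    return (r90 - r0) / total
```

The index ranges from −1 (responds only at 0 degrees) to +1 (responds only at 90 degrees), with 0 indicating no preference between the two orientations.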

Source data

Extended Data Fig. 10 LED illumination alone cannot explain changes to somatic event rate, SD residual, or decoding of task-related variables.

a,b, Viral targeting and imaging/LED stimulation strategy schematic for the activation of NDNF+ interneurons in 4 mice (56 imaging sessions), and for a control group of 9 Rbp4 mice (42 imaging sessions) expressing GCaMP7f in layer 5 neurons and exposed to LED light stimulation without expressing ChRmine in NDNF+ interneurons. c, Same as Fig. 3g, but for NDNF activation experiments. White dashed lines indicate the somatic and dendritic field of view as shown in d. Scale bar = 50 μm. d, Somatic and dendritic field of view as indicated by the dashed lines in a. Layer 5 neurons labelled with GCaMP7f. The two images show the artefact produced by the simultaneous imaging and LED illumination for optogenetic stimulation. The LED light is on for 6 ms at the beginning of each frame while the PMTs stay off for 1 extra ms (7 ms total PMT off time). After the first 7 ms, the LED light turns off while the PMTs engage to record GCaMP7f activity for the remainder of the frame. Scale bar = 100 μm. e, Probability distributions of the SD residual for the two groups of mice. For each neuron, the relationship between somatic and dendritic activity was first established during the opto-off condition, and then all events during opto-on were superimposed onto this relationship. An SD residual of 0 means that there is no difference in SD residual during LED light on vs LED light off (t-test, p = 3.37e⁻¹³⁶. Mean = −0.1 and −0.9; s.e.m. = −0.02 and 0.02; n = 2087 and 1886 neurons for ChRmine+ and ChRmine– groups, respectively). f, Somatic event rate in the two groups of mice, estimated as event rate (Hz) during opto on – opto off (t-test, p = 1.6e⁻²⁹⁷. Mean = −0.003 and −0.02; s.e.m. = −0.0002 and 0.0004; n = 3481 and 2884 neurons for ChRmine+ and ChRmine– groups, respectively). g, Same as Fig. 4j (left panel, paired t-test, p = 0.001. Mean = 0.62 and 0.57; s.e.m. = 0.02 and 0.02, respectively for test and shuffle data. N = 64 sessions from 5 mice) for a control group of 5 Rbp4 mice expressing GCaMP7f in layer 5 neurons and exposed to LED stimulation without expressing ChRmine in NDNF+ interneurons. h, Same as Fig. 4l (right panel, paired t-test, p = 0.001. Mean = 0.58 and 0.52; s.e.m. = 0.01 and 0.001, respectively for test and shuffle data. N = 65 sessions in 5 mice) for a control group of 5 Rbp4 mice expressing GCaMP7f in layer 5 neurons and exposed to LED stimulation without expressing ChRmine in NDNF+ interneurons.
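The residual calculation described in panel e (fit the soma-dendrite relationship on opto-off events, then superimpose opto-on events onto that fit) can be sketched as follows. This is a minimal Python illustration that assumes a linear fit, in line with Extended Data Fig. 5. The function names and inputs are hypothetical, and any z-scoring or normalization used in the paper is omitted.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope

def opto_on_residuals(soma_off, dend_off, soma_on, dend_on):
    """SD residuals of opto-on events relative to the opto-off fit.

    The soma-dendrite relationship is estimated from opto-off events only;
    each opto-on event is then compared with the dendritic magnitude that
    this fit predicts from its somatic magnitude.
    """
    intercept, slope = fit_line(soma_off, dend_off)
    return [d - (intercept + slope * s) for s, d in zip(soma_on, dend_on)]
```

Under this convention a mean opto-on residual of 0 indicates that illumination did not shift dendritic magnitude relative to the soma, while a negative mean indicates relative dendritic attenuation.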

Source data

Extended Data Fig. 11 Dendritic reward signals are cell-specific during periods of reward anticipation but not during reward consumption.

a, SD residuals in the two seconds following target reach for rewarded trials (left) and the end of a trial for rewarded and unrewarded trials (right), respectively, for P+, P- and P₀ neurons (rewarded epochs, One-way repeated measures ANOVA, p = 0.5. After Tukey’s multiple comparison p = 0.75, p = 0.93 and p = 0.78 for P+ vs P-, P+ vs P₀ and P- vs P₀, respectively. Mean = 0.01, 0.06 and 0.02; s.e.m. = 0.02, 0.06 and 0.01 for P+, P- and P₀, respectively. Unrewarded epochs, One-way repeated measures ANOVA, p = 0.47. After Tukey’s multiple comparison p = 0.65, p = 0.99 and p = 0.57 for P+ vs P-, P+ vs P₀ and P- vs P₀, respectively. Mean = 0.03, 0.06 and 0.02, s.e.m. = 0.04, 0.04 and 0.01 for P+, P- and P₀, respectively. n = 84 sessions). b, SVM weights for decoding rewarded vs unrewarded trials after the end of a trial for P+, P- and P₀ neurons. Here, the rewarded side of the hyperplane was arbitrarily assigned as the positive side, while the unrewarded one was assigned as the negative one (One-way repeated measures ANOVA, p = 0.72. After Tukey’s multiple comparison p = 0.87, p = 0.99 and p = 0.71 for P+ vs P-, P+ vs P₀ and P- vs P₀, respectively. Mean = −0.03, −0.1 and −0.03; s.e.m. = 0.05, 0.07 and 0.2 for P+, P- and P₀, respectively. n = 84 sessions). c, SD residual in the two seconds preceding target reach for rewarded trials (left) and in the two seconds preceding the end of a trial for unrewarded trials (right), for P+, P- and P₀ neurons (rewarded epochs, One-way repeated measures ANOVA, p = 2e⁻⁴. After Tukey’s multiple comparison p = 5.6e⁻⁴, p = 4.8e⁻³ and p = 0.035 for P+ vs P-, P+ vs P₀ and P- vs P₀, respectively. Mean = 0.09, −0.11 and 0.01, s.e.m. = 0.03, 0.05 and 0.01 for P+, P- and P₀, respectively. Unrewarded epochs, One-way repeated measures ANOVA, p = 0.83. After Tukey’s multiple comparison p = 0.92, p = 0.99 and p = 0.9 for P+ vs P-, P+ vs P₀ and P- vs P₀, respectively. Mean = −0.01, −0.04 and −0.01, s.e.m. = 0.04, 0.04 and 0.01 for P+, P- and P₀, respectively. n = 84 sessions). d, SVM weights for decoding rewarded vs unrewarded trials for P+, P- and P₀ neurons. As in b, the rewarded side of the hyperplane was arbitrarily assigned as the positive side, while the unrewarded one was assigned as the negative one (One-way repeated measures ANOVA, p = 4.1e⁻⁴. After Tukey’s multiple comparison p = 3.8e⁻³, p = 5.6e⁻³ and p = 0.18 for P+ vs P-, P+ vs P₀ and P- vs P₀, respectively. Mean = 0.15, −0.02 and 0.03, s.e.m. = 0.04, 0.03 and 0.01 for P+, P- and P₀, respectively. n = 84 sessions).

Source data

Extended Data Fig. 12 Four additional example neurons (2 P+ and 2 P-) showing different somato-dendritic relationships during error decrease and error increase epochs.

For two P+ and two P- neurons, the mean ΔF/F₀ signals in the soma (black) and in the dendrite (orange) are shown for all events that occurred during epochs of error reduction and error increase, respectively. The bar graph represents the mean SD residual value (z-scored) for all events that occurred during error decrease and error increase epochs. Shaded areas and error bars represent s.e.m.

Extended Data Fig. 13 Dendritic error signals cannot be explained away by differences in somatic event magnitude, do not represent absolute error values, and are consistent across mice.

a, b, The cumulative distribution function for SD residuals (z-scored per neuron) in P+ (red) and P- (blue) neurons during error decrease in a (t-test, p = 3.5e⁻⁶, mean = −0.003 and −0.16; s.e.m. = 0.015 and 0.03; n = 211 and 179 for P+ and P-) and error increase in b (t-test, p = 4.6e⁻⁶, mean = −0.1 and 0.05; s.e.m. = 0.03 and 0.02; n = 211 and 179 for P+ and P- neurons) epochs. Only neurons with the same somatic response to error decrease and increase were selected for this analysis. c, The cumulative distribution function for SD residuals (z-scored per neuron) in P+ (red) and P- (blue) neurons during epochs in which the absolute error was smaller than 45 degrees (t-test, p = 0.16; mean = −0.03 and 0.07; s.e.m. = 0.017 and 0.02; n = 290 and 244 for P+ and P- neurons, respectively). d, Same as a, but for epochs in which the absolute error was larger than 45 degrees (t-test, p = 0.47; mean = −0.03 and −0.006; s.e.m. = 0.02 and 0.01; n = 278 and 249 for P+ and P- neurons, respectively). e, For all mice (N = 6), the separability of dendritic signals, defined as the SD residual during epochs of error decrease minus the SD residual during epochs of error increase, for P+ (in red), P- (in blue), and overall dendritic separability ((P+) – (P-)) (Paired t-test, p = 0.03, p = 0.006 and p = 0.005; Mean = 0.08, −0.15 and 0.23; s.e.m. = 0.03, 0.03 and 0.05 for P+, P- and (P+) – (P-), respectively. N = 6 animals. Note that comparing P+ vs. P- or (P+) – (P-) vs. 0 is statistically equivalent).
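The separability measure defined in panel e can be written out as a short Python sketch for one animal. The function names and the input layout (lists of z-scored SD residuals per epoch type) are assumptions made for illustration, not the authors' code.

```python
from statistics import mean

def separability(residuals_decrease, residuals_increase):
    """Mean SD residual during error decrease minus during error increase."""
    return mean(residuals_decrease) - mean(residuals_increase)

def dendritic_separability(p_plus, p_minus):
    """Per-population and overall separability for one animal.

    p_plus, p_minus: (residuals_decrease, residuals_increase) tuples holding
    z-scored SD residuals for P+ and P- neurons (hypothetical inputs).
    Returns (P+ separability, P- separability, (P+) - (P-)).
    """
    sep_plus = separability(*p_plus)
    sep_minus = separability(*p_minus)
    return sep_plus, sep_minus, sep_plus - sep_minus
```

A vectorized signal predicts opposite signs for the two populations (positive for P+, negative for P-), and therefore a positive overall value, which matches the direction of the group means reported in panel e.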

Source data

Extended Data Fig. 14 P₀ neurons which are functionally-correlated to P+ and P- neurons also receive vectorized dendritic error signals.

a, Across all sessions, we divided P₀ neurons into (P+)-like and (P-)-like based on their correlation to the (P+) – (P-) subtracted signal we used to rotate the visual stimulus. Among all P₀ neurons (n = 9796), we defined (P+)-like and (P-)-like P₀ neurons as those with a correlation above 1 and below −1 standard deviation, respectively. b, Left panel, during error reduction epochs, bar graph comparing the SD residual (z-scored) in P+, P- and P₀ neurons (One-way ANOVA, p = 1.6e⁻⁴. After Tukey’s multiple comparisons, p = 0.006, p = 1 and p = 8.9e⁻⁵ for P+ vs P-, P+ vs P₀ and P- vs P₀, respectively. Mean = 0.004, −0.13 and 0.007; s.e.m. = 0.02, 0.02 and 0.01; n = 292, 240 and 8335 for P+, P- and P₀, respectively). Right panel, during error reduction epochs, bar graph comparing the SD residual (z-scored) in (P+)-like and (P-)-like P₀ neurons (t-test, p = 0.025; mean = 0.04 and −0.02; s.e.m. = 0.02 and 0.02; n = 1059 and 1110 for (P+)-like and (P-)-like P₀ neurons, respectively). c, Left panel, during error increase epochs, bar graph comparing the SD residual (z-scored) in P+, P- and P₀ neurons (One-way ANOVA, p = 0.002. After Tukey’s multiple comparisons, p = 0.004, p = 0.45 and p = 0.003 for P+ vs P-, P+ vs P₀ and P- vs P₀, respectively. Mean = −0.1, 0.05 and 0.01; s.e.m. = 0.02, 0.02 and 0.01; n = 267, 249 and 8356 for P+, P- and P₀, respectively). Right panel, during error increase epochs, bar graph comparing the SD residual (z-scored) in (P+)-like and (P-)-like P₀ neurons (t-test, p = 0.010; mean = −0.05 and 0.01; s.e.m. = 0.02 and 0.01; n = 1070 and 1099 for (P+)-like and (P-)-like P₀ neurons, respectively).
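The grouping rule in panel a can be sketched in Python as follows. The legend does not specify the reference for the ±1 standard deviation threshold, so this sketch assumes it is applied to correlations z-scored across all P₀ neurons (that is, relative to their mean). The function names, labels and input layout are hypothetical.

```python
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def label_p0_neurons(traces, control_signal):
    """Label each P0 trace by its correlation with the (P+) - (P-) signal.

    Assumption: thresholds of +1 and -1 s.d. are taken on the distribution
    of correlations across all P0 neurons.
    """
    r = [pearson(trace, control_signal) for trace in traces]
    mu, sd = mean(r), pstdev(r)
    labels = []
    for value in r:
        z = (value - mu) / sd if sd > 0 else 0.0
        if z > 1:
            labels.append("(P+)-like")
        elif z < -1:
            labels.append("(P-)-like")
        else:
            labels.append("other")
    return labels
```

Traces with zero variance would make the correlation undefined and are assumed to have been excluded beforehand.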

Source data

Extended Data Fig. 15 LED illumination alone is not responsible for the abolishment of vectorized error signals and impaired learning.

a, Experimental schematic: P+ and P- neurons were recorded during the activation of NDNF+ interneurons (N = 4 mice) (a_i) and in a control group of 5 Rbp4 mice expressing GCaMP7f in layer 5 neurons and exposed to LED stimulation without expressing ChRmine in NDNF+ interneurons (a_ii). b, Upper panel (b_i), same as Fig. 5g. The activation of L1 NDNF+ neurons leads to the abolishment of vectorized error signals in P+ and P- neurons. Left, during error reduction epochs, the cumulative distribution function for SD residuals (z-scored across all neurons) for P+ (in red) and P- (in blue) neurons (t-test; p = 0.58; mean = 0.06 and 0.02; s.e.m. = 0.04 and 0.04; n = 119 and 100 for P+ and P- neurons, respectively, from 4 mice). Right panel, for error increase epochs (t-test, p = 0.7; mean = −0.02 and 0; s.e.m. = 0.05 and 0.03; n = 105 and 105 for P+ and P- neurons, respectively). Lower panel (b_ii), the activation of the LED light in Rbp4-Cre animals expressing GCaMP7f in layer 5 neurons, but not ChRmine in NDNF+ interneurons. Left, during error reduction epochs, the cumulative distribution function for SD residuals (z-scored across all neurons) for P+ (in red) and P- (in blue) neurons (t-test; p = 8.2e⁻⁷; mean = 0.08 and −0.1; s.e.m. = 0.03 and 0.03; n = 206 and 171 for P+ and P- neurons, respectively, from 5 mice). Right panel, for error increase epochs (t-test, p = 5.9e⁻⁴; mean = −0.06 and 0.1; s.e.m. = 0.04 and 0.03; n = 182 and 177 for P+ and P- neurons, respectively). c,d, BCI performance (accuracy in c and rewards per minute in d) for the last 6 days of training (late training) in the two groups of animals: those expressing ChRmine in NDNF+ interneurons (n = 4) and their ChRmine-negative LED illumination counterparts (n = 5) (for accuracy in c, t-test; p = 3.1e⁻⁴; mean = 0.56 and 0.71; s.e.m. = 0.04 and 0.02; n = 24 and 30 for NDNF+ neuron activation and LED control, respectively; paired t-test, p = 0.1 for NDNF+ neuron activation vs chance and p = 2e⁻¹¹ for LED control vs chance. For rewards per minute in d, t-test; p = 2.6e⁻⁴; mean = 1.9 and 3.1; s.e.m. = 0.2 and 0.3; n = 24 and 30 for NDNF+ neuron activation and LED control, respectively; paired t-test, p = 0.22 for NDNF+ neuron activation vs control day 1 and p = 6.8e⁻⁹ for LED control vs control day 1).

Source data

Supplementary information

Supplementary Information

Supplementary Figs. 1–3

Reporting Summary

Source data

Source Data Figs. 1–5 and Extended Data Figs. 2–11 and 13–15.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Francioni, V., Tang, V.D., Toloza, E.H.S. et al. Vectorized instructive signals in cortical dendrites. Nature (2026). https://doi.org/10.1038/s41586-026-10190-7

Received: 02 December 2022
Accepted: 23 January 2026
Published: 25 February 2026
Version of record: 25 February 2026
DOI: https://doi.org/10.1038/s41586-026-10190-7
