Introduction

The vast array of diverse brain functions arises from irregular spiking across neural populations, observable on recording electrodes in experiments. Yet, most current hypotheses about neural computation employ the firing rate, an abstract mathematical quantity describing the propensity of a neuron to spike. For example, prevalent theories suggest that neurons change their firing rate to signal features of sensory stimuli1,2 or parameters of body movements3. Moreover, complex cognitive computations emerge from the dynamics of latent variables tracing trajectories through a state space, where each axis corresponds to the firing rate of one neuron in the population4,5,6,7,8,9. Since the firing rate is a latent quantity not directly observable in experiments, testing any of these theories requires relating the firing rate to experimentally measured spikes. Specifying this relationship effectively partitions the total variability in the spiking output of a neuron into two components: changes in the firing rate and irregularity of the process generating spikes from this firing rate10,11.

The traditional partitioning method defines the firing rate as the average of spikes over repeated trials under the same experimental conditions, thus attributing any trial-to-trial variability entirely to the irregular spike generation in single neurons1,2,4. This definition implies that the firing rate is the same on each trial and deterministically locked to experimentally controlled events, e.g., stimulus or movement onset. However, neural responses fluctuate significantly across trials due to many factors beyond experimental control, including changes in behavioral state12,13,14 and endogenous network dynamics10,15,16. Capturing the dynamics of these latent factors requires a partitioning framework that accounts for the firing rate fluctuations on single trials.

The most common approach to account for the firing rate fluctuations is to assume an inhomogeneous Poisson process as a model for spike generation on single trials11. In this model, spikes occur independently of each other, with the probability of spiking at each moment controlled by the instantaneous firing rate17. The mathematical convenience of the inhomogeneous Poisson process led to its widespread application in existing methods for inferring the dynamics of firing rates and associated latent variables on single trials18,19,20,21,22,23,24,25. However, the spiking of neurons across many brain areas deviates significantly from the inhomogeneous Poisson assumption. For example, many neurons have a Fano factor (FF, the variance-to-mean ratio of spike counts) less than one10,26,27,28, which is the minimal FF value theoretically possible for an inhomogeneous Poisson process. Unfortunately, incorrect assumptions about the irregularity of the spiking process can lead to errors in estimating the link between changes in firing rates and behavior and to misleading conclusions when arbitrating between alternative hypotheses about single-trial dynamics of latent variables29,30. Despite these limitations, there is no alternative model for accurately partitioning spiking variability in experimental data. While a few previous methods attempted to estimate spiking irregularity beyond the Poisson assumption10,28,31, these methods relied on heuristic assumptions that do not always hold and, as we show, are therefore sensitive to nuisance parameters such as firing rate and bin size.

We introduce a doubly stochastic renewal (DSR) point process, a mathematical framework for partitioning spiking variability, which accounts for firing rate fluctuations and provides a flexible model to capture the broad spectrum of spiking irregularity from periodic to super-Poisson. Using our framework, we devise a method for estimating spiking irregularity and show that it accurately recovers ground truth on simulated spike trains with a wide range of firing rate dynamics and spiking irregularity. We further validate the theoretical assumptions in our framework using intracellular recordings of membrane potential from neurons in the barrel cortex of awake mice32. We apply our approach to quantify the spiking irregularity of cortical neurons and find that spiking irregularity decreases from visual to higher sensory-motor areas, mirroring the gradient of unpartitioned variability26,27,33,34. Moreover, the spiking irregularity is nearly constant for each neuron under many conditions but can also change across task epochs. Using spiking network models35,36, we show that spiking irregularity depends on connectivity and biophysical properties of single neurons and can change with external input. Our work establishes that a DSR point process is a flexible model to capture the broad spectrum of spiking irregularity of cortical neurons and improve the precision of methods for estimating latent dynamics on single trials.

Results

We develop the DSR point process model as a flexible alternative to the standard inhomogeneous Poisson process, capable of accounting for the diverse spiking irregularity of cortical neurons. We first introduce our mathematical framework and precisely define the spiking irregularity as a parameter ϕ within the DSR model. We then present a method for estimating ϕ from data and validate its accuracy using synthetic spike trains and intracellular voltage recordings. Finally, we describe our observations of spiking irregularity in several cortical areas, which reveal systematic differences from sensory to motor areas and indicate that Poisson irregularity is a rare exception, not a rule.

Doubly stochastic renewal framework

Mathematically, a spike train is a point process, that is, a sequence of discrete events occurring randomly in time. A doubly stochastic point process generates spikes stochastically from the firing rate that also fluctuates over time and from trial to trial. To define a doubly stochastic point process, we need to specify two components: a non-negative real-valued stochastic process λ(t) for the instantaneous firing rate and a point process generating spikes from a realization of the firing rate λ(t).

The simplest point process model is the Poisson process17, in which spikes occur independently of each other with the probability λ(t)dt in any infinitesimal time interval [t, t + dt]. The Poisson point process generates spikes with fixed irregularity: for a constant firing rate, the FF equals one for a time bin of any size. Since firing rate fluctuations only increase variability10, FF is always greater than one for Poisson processes with fluctuating firing rate11. Therefore, the inhomogeneous Poisson process cannot account for diverse spiking statistics across neurons, in particular, neurons with FF smaller than one10,26,27,28.
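The bound on FF can be made explicit with a short worked calculation. As a simplifying assumption for this sketch, take the firing rate to be approximately constant at a value λ within a bin of size T, with λ varying from trial to trial. Conditional on λ, the Poisson spike count has mean and variance both equal to λT, so the law of total variance gives

$$\mathrm{Var}(N_T)=\mathrm{E}[\lambda T]+\mathrm{Var}(\lambda T),\qquad \mathrm{FF}=\frac{\mathrm{Var}(N_T)}{\mathrm{E}[N_T]}=1+T\,\frac{\mathrm{Var}(\lambda)}{\mathrm{E}[\lambda]}\ \ge\ 1,$$

with equality only when the firing rate does not fluctuate, Var(λ) = 0.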

A more flexible model is a renewal point process, in which the probability of generating a spike depends on the time elapsed since the last spike17,37. Mathematically, the dependence between consecutive spike times can be described by the probability density g(·) of interspike intervals (ISIs). For a constant firing rate, the probability of the next spike occurring in the interval [t, t + dt] is proportional to \(g(t-{t}_{l})dt\), where \({t}_{l}\) is the time of the last spike. The shape of the ISI distribution g(·) controls the irregularity of the renewal point process. A narrow distribution peaked around a set ISI value will produce a nearly periodic spike train, whereas an exponential distribution results in an irregular spike train equivalent to the Poisson process.

Here, we introduce a DSR point process, which we define by a pair {g(·), λ(t)} via a three-step algorithm for generating spike trains (Fig. 1a). First, we sample a realization of the firing rate λ(t) for a specific trial. Next, we sample ISIs from the probability density g(·) to generate spikes in operational time \({t}^{{\prime} }\). We set the mean of the ISI probability density function to one: \({\mu }_{g}=\int_{0}^{\infty }\theta g(\theta )d\theta=1\) s, which implies the firing rate of the point process in operational time is \({\mu }_{g}^{-1}=1\) Hz. Finally, we map the spike times from the operational time \({t}^{{\prime} }\) to the real time t by locally squeezing and stretching time in proportion to the inverse cumulative firing rate \(t={\Lambda }^{-1}({t}^{{\prime} })\), where

$${t}^{{\prime} }=\Lambda (t)={\int}_{\!\!\!\!0}^{t}\lambda (s)ds\,.$$
(1)

This transformation ensures that the spike density in real time follows the instantaneous firing rate λ(t)38,39.
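The three-step algorithm can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation, and it rests on several assumptions of ours: the function name `sample_dsr_spike_train` is hypothetical, g(·) is taken to be a gamma density with mean 1 and variance ϕ (shape 1/ϕ, scale ϕ), the firing rate is supplied as a strictly positive array on a regular grid, and the renewal process is started at operational time zero as if a spike had just occurred.

```python
import numpy as np

def sample_dsr_spike_train(rate, dt, phi, rng):
    """Sample one spike train from a DSR process with gamma ISIs (a sketch).

    rate : 1-D array of firing rates lambda(t) in Hz (assumed > 0), grid step dt in s
    phi  : squared coefficient of variation of the ISI density g
    rng  : numpy.random.Generator
    """
    # Step 1 is done by the caller: `rate` is one realization of lambda(t).
    # Cumulative firing rate Lambda(t) of Eq. (1), evaluated at the grid edges.
    t_edges = np.arange(len(rate) + 1) * dt
    Lambda = np.concatenate(([0.0], np.cumsum(rate) * dt))

    # Step 2: spikes in operational time from gamma ISIs with mean 1, variance phi.
    n_draw = int(Lambda[-1] + 10.0 * np.sqrt(phi * Lambda[-1]) + 10)
    spikes_op = np.cumsum(rng.gamma(1.0 / phi, phi, size=n_draw))
    while spikes_op[-1] < Lambda[-1]:  # rarely needed: extend the sequence
        extra = rng.gamma(1.0 / phi, phi, size=n_draw)
        spikes_op = np.concatenate((spikes_op, spikes_op[-1] + np.cumsum(extra)))
    spikes_op = spikes_op[spikes_op < Lambda[-1]]

    # Step 3: map to real time, t = Lambda^{-1}(t'), by linear interpolation.
    return np.interp(spikes_op, Lambda, t_edges)
```

For a constant rate of 20 Hz, for example, the returned ISIs multiplied by 20 should have a mean close to 1 and a squared coefficient of variation close to the chosen ϕ.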

Fig. 1: Doubly stochastic renewal point process model.

a We define a doubly stochastic renewal point process by a pair {g(·), λ(t)}. A neuron-specific function g(·) is the ISI probability density in the operational time \({t}^{{\prime} }\) which controls the irregularity of the renewal point process, that is, the variability of spike generation (lower left, blue). A stochastic process λ(t) defines the dynamics of firing rate on single trials and controls the trial-to-trial firing rate fluctuations (upper left, green). The pair {g(·), λ(t)} defines the spike generation process via a three-step algorithm. First, a realization of the firing rate λ(t) for a specific trial is sampled from the process λ(t) (upper center; green: λ(t) for a specific trial, gray: λ(t) for multiple trials). Second, spike times are sampled in the operational time from the ISI probability density function g(·) (lower center, blue ticks). Since the mean of the ISI distribution g(·) is set to μg = 1 s, the mean firing rate of spikes in the operational time is 1 Hz. Third, the spikes are mapped from the operational time \({t}^{{\prime} }\) to the real time t via \(t={\Lambda }^{-1}({t}^{{\prime} })\), where the map is defined by the cumulative firing rate function \({t}^{{\prime} }=\Lambda (t)=\int_{0}^{t}\lambda (s)ds\) (right, green line). b Examples of diverse spiking activity generated from a doubly stochastic renewal model. We consider firing rate λ(t) that is constant within and across trials (upper row) or follows a drift-diffusion process with sticky boundaries on single trials (lower row). From each realization of the firing rate λ(t) (color gradient, first column), we generate spikes using the doubly stochastic renewal model in which g(·) is a gamma distribution with ϕ = 0.3 (sub-Poisson, second column), ϕ = 1 (Poisson, third column), and ϕ = 1.7 (super-Poisson, fourth column). Differences in firing rate variability and spiking irregularity ϕ are difficult to discern from the resulting diverse patterns of spiking activity.

In our framework, the spiking irregularity is defined by the ISI distribution g(·) in the operational time, which controls spiking irregularity independently of the firing rate. In particular, using the same distribution g(·) (the same spiking irregularity), our DSR model can generate spike trains with high or low firing rate using different λ(t). Conversely, for the same firing rate, our DSR model can generate spike trains with high or low spiking irregularity using different distributions g(·). A special case is when g(·) belongs to a two-parameter family of continuous probability distributions uniquely determined by its mean μg and standard deviation σg. In this case, we denote the squared coefficient of variation40 (CV²) of the distribution g(·) by

$$\phi \equiv \frac{{\sigma }_{g}^{2}}{{\mu }_{g}^{2}}.$$
(2)

With these assumptions, ϕ uniquely determines the distribution g(·), since μg = 1 s by our definition of the DSR process. Therefore, a single parameter ϕ fully controls the spiking irregularity. For different values of ϕ and firing-rate fluctuations, our DSR model can generate a broad spectrum of spiking activity, ranging from nearly periodic to highly irregular both within and across trials (Fig. 1b).

Partitioning variability in data

In experiments, we only have access to the total spiking variability that includes contributions from both firing rate fluctuations and spiking irregularity. A common metric of the total spiking variability is the variance Var(NT) of spike count NT measured in time bins of size T. For doubly stochastic processes, the total variance Var(NT) arises from the firing rate and point process components. We assume that the instantaneous firing rate changes on a timescale τ longer than the bin size τ > T, which implies λ(t) is approximately constant λ within a bin. Then, we can use the law of total variance41 to decompose the total spike-count variance into the firing rate and point process components10:

$${{\mbox{Var}}}({{\boldsymbol{N}}}_T)=\underbrace{{{\mbox{Var}}}({{\mbox{E}}}[{{\boldsymbol{N}}}_T|{{\boldsymbol{\lambda}}}])}_{{{\rm{firing}}} \; {{\rm{rate}}} \; {{\rm{variance}}}}+\underbrace{{{\mbox{E}}}[{{\mbox{Var}}}({{\boldsymbol{N}}}_T|{{\boldsymbol{\lambda}}})]}_{{{\rm{point}}} \; {{\rm{process}}} \; {{\rm{variance}}}}.$$
(3)

Within our DSR framework, we can express the two terms in this decomposition via {g(·), λ}. The first term is the firing rate variance and equals Var(E[NT | λ]) = Var(λT) (Methods). The second term is the point process variance and, for moderately large bin size T > 1/E[λ], we can approximate it as (Supplementary Note 1.1):

$$\,{{\mbox{E}}}\,[\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T}| {{\boldsymbol{\lambda }}})]={\left(\frac{{\sigma }_{g}}{{\mu }_{g}}\right)}^{2}\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}]+\frac{1}{6}+\frac{1}{2}\cdot {\left(\frac{{\sigma }_{g}}{{\mu }_{g}}\right)}^{4}-\frac{1}{3}\cdot \frac{{{\mu }_{3}}_{g}}{{\mu }_{g}^{3}}+{{\mathcal{O}}}({T}^{-1}).$$
(4)

Here μg, σg, and \({{\mu }_{3}}_{g}\) are the mean, standard deviation, and third central moment of the distribution g(·), and \({{\mathcal{O}}}({T}^{-1})\) indicates the approximation error scaling as T⁻¹. Since μg = 1 s by our definition of the DSR process, two parameters σg and \({{\mu }_{3}}_{g}\) control the point process variance.

Next, we introduce simplifying assumptions to develop this general theoretical result into a practical data analysis method. We consider g(·) to be the gamma distribution, a particular case of the two-parameter distribution family that has proven useful for modeling ISI data38,42,43. For the gamma distribution, the third central moment is given as \({{\mu }_{3}}_{g}=2{\phi }^{2}\) (Methods), and we can simplify the partitioning equation Eq. (4) to be

$$\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T})=\,{{\mbox{Var}}}\,({{\boldsymbol{\lambda }}}T)+\frac{1}{6}(1-{\phi }^{2})+\phi \,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}]+{{\mathcal{O}}}({T}^{-1}).$$
(5)

The second and third terms in this equation express the point process variance via a single parameter ϕ. For ϕ = 1, the gamma distribution reduces to the exponential distribution, and the renewal point process is a Poisson process with variance equal to the mean spike count in Eq. (5).
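The step from Eq. (4) to Eq. (5) can be checked directly. For a gamma density with shape k and scale s (symbols introduced here only for this check), the mean, variance, and third central moment are ks, ks², and 2ks³. Imposing μg = 1 and σg² = ϕ gives k = 1/ϕ and s = ϕ, hence a third central moment of 2ϕ². Substituting into the constant terms of Eq. (4):

$$\frac{1}{6}+\frac{1}{2}{\phi }^{2}-\frac{1}{3}\cdot 2{\phi }^{2}=\frac{1}{6}-\frac{1}{6}{\phi }^{2}=\frac{1}{6}\left(1-{\phi }^{2}\right).$$

Adding the linear term ϕ E[NT] and the firing rate variance Var(λT) from Eq. (3) recovers Eq. (5). At ϕ = 1 the constant term vanishes, so the point process variance equals the mean spike count, as expected for a Poisson process.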

In summary, we partitioned the total spike-count variance Var(NT) into the firing rate and point process components, with spiking irregularity controlled by a single parameter ϕ. To make this partition unambiguous29, we constrained the spike generation to be a renewal point process and enforced the smoothness of the firing rate.

Estimation from data

In partitioning equation Eq. (5), the spike-count mean E[NT] and variance Var(NT) can be measured directly from data, whereas ϕ and the firing rate variance Var(λT) are unknown. Thus, to partition variability in spike data, we first need to estimate ϕ. We devise an estimation method for ϕ, which we call the DSR method, based on our assumption that the firing rate changes smoothly and is approximately constant within a bin. We apply Eq. (5) to spike counts measured in two bin sizes, T and 2T, to yield two equations, which we solve to obtain a quadratic equation for ϕ (Methods):

$$\frac{1}{2}\left({\phi }_{{{\rm{DSR}}}}^{2}-1\right)-(4\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}]-\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{2T}]){\phi }_{{{\rm{DSR}}}}+4\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T})-\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{2T})=0\,.$$
(6)

Here, the spike-count mean and variance for each bin size E[NT], E[N2T], Var(NT), Var(N2T) are measured directly from the spike data, and ϕDSR is the only unknown variable. Thus, we can solve Eq. (6) to estimate ϕDSR from data. Then, we can separate the firing rate and point process variance using the estimated ϕDSR in Eq. (5).
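Solving Eq. (6) amounts to finding a root of a quadratic in ϕ. The sketch below is one way to do this, not the authors' code: the function name `estimate_phi_dsr` is hypothetical, and taking the smaller root is our assumption. Writing b = 4E[NT] − E[N2T] and c = 4Var(NT) − Var(N2T), Eq. (6) becomes ϕ² − 2bϕ + (2c − 1) = 0, whose two roots sum to 2b; for moments that satisfy Eq. (5) exactly and ϕ < b, the smaller root is ϕ and the other root is 2b − ϕ.

```python
import math

def estimate_phi_dsr(mean_T, mean_2T, var_T, var_2T):
    """Solve Eq. (6) for phi from spike-count moments at bin sizes T and 2T.

    Returns the smaller root of phi^2 - 2*b*phi + (2*c - 1) = 0
    (an assumption of this sketch), or NaN if no real root exists.
    """
    b = 4.0 * mean_T - mean_2T
    c = 4.0 * var_T - var_2T
    disc = b * b - 2.0 * c + 1.0
    if disc < 0.0:
        return float("nan")  # noisy moments can leave the quadratic without real roots
    return b - math.sqrt(disc)

def firing_rate_variance(mean_T, var_T, phi):
    """Rearrange Eq. (5) to obtain Var(lambda * T) once phi is known."""
    return var_T - (1.0 - phi ** 2) / 6.0 - phi * mean_T
```

As a consistency check, moments constructed from Eq. (5) with ϕ = 0.5, E[NT] = 2, and Var(λT) = 0.3 give Var(NT) = 1.425 and Var(N2T) = 3.325, from which both functions recover the inputs.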

We consider two criteria for selecting the bin size T used to estimate spiking irregularity. On one hand, we require T > 1/E[λ] to ensure that the partitioning Eq. (5) holds. On the other hand, the bin size should be as small as possible, given the assumption that the firing rate remains constant within each bin. To satisfy both conditions, we set T = 2/E[λ] for each neuron in all analyses (Methods). We confirmed that our estimation method remains robust across a broad range of bin sizes (Supplementary Note 1.2, Supplementary Figs. 1–3).
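The bin-size rule and the four moments entering Eq. (6) can be computed from spike times as in the following sketch. The details are our assumptions, since the exact procedure is given in the Methods and not reproduced here: E[λ] is estimated as the trial-averaged spike count divided by the trial duration, counts are taken in a single window starting at `t_start`, and the helper names `choose_bin_size` and `count_moments` are hypothetical.

```python
import numpy as np

def choose_bin_size(trials, duration):
    """Return T = 2 / E[lambda], with E[lambda] estimated from the mean spike count."""
    mean_rate = np.mean([len(s) for s in trials]) / duration
    return 2.0 / mean_rate

def count_moments(trials, t_start, T):
    """Across-trial mean and variance of spike counts in [t_start, t_start + T)
    and in [t_start, t_start + 2T). `trials` is a list of spike-time arrays."""
    n_T, n_2T = [], []
    for s in trials:
        s = np.asarray(s)
        n_T.append(np.sum((s >= t_start) & (s < t_start + T)))
        n_2T.append(np.sum((s >= t_start) & (s < t_start + 2.0 * T)))
    n_T, n_2T = np.array(n_T), np.array(n_2T)
    # Unbiased (ddof=1) variances across trials.
    return n_T.mean(), n_2T.mean(), n_T.var(ddof=1), n_2T.var(ddof=1)
```

The four returned values are the quantities E[NT], E[N2T], Var(NT), and Var(N2T) that appear in Eq. (6).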

Comparison with previous estimation methods

We derive our partition equation Eq. (4) and estimation method Eq. (6) rigorously using the theory of renewal point processes17,37. Two other methods have been broadly used for estimating point process variance10,27,28,39,44,45, but these previous methods relied on several assumptions and heuristics that are not always applicable. We first theoretically analyze how these assumptions impact the estimation accuracy of previous methods. We then evaluate these previous approaches and our method on synthetic data with known ground truth.

The first method, which we refer to as the deterministic time rescaling (DTR) method39, assumes that the time-dependent firing rate λ(t) is deterministic, i.e., does not fluctuate from trial to trial. Accordingly, the firing rate is the same on each trial and can be estimated by averaging spike counts in a bin across trials \(\hat{\lambda }({t}_{i})\), which is called a peristimulus time histogram46. One can then substitute the estimated firing rate \(\hat{\lambda }({t}_{i})\) in Eq. (1) to map the spike times into the operational time \({t}^{{\prime} }\). This mapping removes the effect of the time-dependent changes in firing rate locked to experimentally controlled events, assuming that the ISI distribution in the operational time reflects only the point process variability, which corresponds to g(·) in our theory. Accordingly, CV² of the rescaled ISIs provides an estimate of the spiking irregularity parameter ϕ, which we denote by \({\phi }_{{{\rm{DTR}}}}\). If the ground-truth firing rate is indeed the same on each trial, \({\phi }_{{{\rm{DTR}}}}\) converges to the ground-truth ϕ for a large trial number (Methods, Supplementary Note 1.3). However, in the presence of the trial-to-trial firing rate fluctuations, this method always overestimates the point process variability, with the error increasing for larger trial-to-trial variability in the firing rate (Methods, Supplementary Note 1.3, Supplementary Fig. 4).
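A minimal sketch of the DTR procedure described above follows. It is an illustration under our own assumptions, not the reference implementation39: the function name `phi_dtr` is hypothetical, the peristimulus time histogram uses `n_bins` equal-width bins, the cumulative rate is interpolated linearly, and rescaled ISIs are pooled across trials before computing CV².

```python
import numpy as np

def phi_dtr(trials, duration, n_bins):
    """Estimate phi by deterministic time rescaling with a PSTH rate estimate.

    trials : list of 1-D arrays of spike times in [0, duration]
    """
    edges = np.linspace(0.0, duration, n_bins + 1)
    width = duration / n_bins

    # Trial-averaged firing rate (PSTH) in Hz.
    counts = np.zeros(n_bins)
    for s in trials:
        counts += np.histogram(s, bins=edges)[0]
    rate_hat = counts / (len(trials) * width)

    # Cumulative rate of Eq. (1) at the bin edges; the same map is used for all trials.
    Lambda = np.concatenate(([0.0], np.cumsum(rate_hat) * width))

    # Map spikes to operational time and collect the rescaled ISIs.
    rescaled_isis = np.concatenate(
        [np.diff(np.interp(np.sort(s), edges, Lambda)) for s in trials]
    )
    return rescaled_isis.var() / rescaled_isis.mean() ** 2
```

Because a single PSTH is applied to every trial, any trial-to-trial rate fluctuation is left in the rescaled ISIs and inflates the estimate, which is the overestimation described in the text.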

The second method, to which we refer as the minimum ratio (MR) method10, allows for firing-rate fluctuations but assumes that the point process variance in Eq. (3) is proportional to the mean spike count:

$$\,{{\mbox{E}}}\,[\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T}| {{\boldsymbol{\lambda }}})]=\phi \,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}],$$
(7)

where the coefficient ϕ is a neuron-specific constant. For renewal processes, the assumption in Eq. (7) is valid only in the limit T → ∞ or in the special case when \({{\mu }_{3}}_{g}=\frac{3}{2}{\phi }^{2}+\frac{1}{2}\) (Methods, Supplementary Note 1.4). The latter condition does not hold in general; for example, when g(·) is a gamma distribution, it holds only for ϕ = 1, i.e., for the Poisson spike generation process (Eq. (5)). Using the ansatz Eq. (7) and the constraint that spike-count variance in Eq. (3) must be positive, the MR method then estimates ϕ as the minimum FF across all time bins:

$${\phi }_{{{\rm{MR}}}}={\min }_{t}\left\{\frac{\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T}(t))}{\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}(t)]}\right\}.$$
(8)

We show that for DSR processes, the estimation error of this method depends on the bin size T, the mean and variance of the firing rate λ(t), and on the ground truth ϕ itself (Methods, Supplementary Note 1.4). The dependence of ϕMR on all these nuisance parameters is inconsistent with the assumption that ϕ is a constant characterizing the renewal process that controls spiking irregularity. Moreover, it shows that ϕMR is affected by several sources of bias, leading to unpredictable estimation errors. Other methods for estimating ϕ in Eq. (7) with a finite bin size have similar limitations31,47 (Methods), although accurate estimation is possible using large bin sizes48.
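For comparison, the MR estimator of Eq. (8) reduces to a few lines. The sketch below assumes that spike counts are already arranged as a trials-by-time-bins array and that bins with zero mean count are skipped; the function name `phi_mr` is ours.

```python
import numpy as np

def phi_mr(counts):
    """Minimum Fano factor across time bins, following Eq. (8).

    counts : array of shape (n_trials, n_bins) with spike counts in bins of size T
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean(axis=0)
    var = counts.var(axis=0, ddof=1)  # across-trial variance in each time bin
    valid = mean > 0                  # the Fano factor is undefined for empty bins
    return np.min(var[valid] / mean[valid])
```

The estimate is determined by whichever single time bin happens to have the lowest Fano factor, which is one way to see why it inherits a dependence on bin size and firing rate statistics.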

We compared the performance of our DSR method with these previous methods on synthetic data generated from DSR point processes with known ground truth ϕ (Fig. 2). Specifically, we chose g(·) to be a gamma distribution, and the firing rate λ(t) either to be constant within each trial and sampled across trials from a uniform distribution on \([\, \mu -\frac{w}{2},\mu+\frac{w}{2}]\), or to follow a drift-diffusion process on single trials as in prominent decision-making models49. Since the DTR method assumes an inhomogeneous renewal process as a generative model (a special case of our DSR model with zero trial-to-trial firing rate variability), it performs well when this variability is low, in a regime consistent with its assumptions. When trial-to-trial firing rate variability is nonzero, DTR always overestimates ϕ, with error increasing as the firing rate variability grows. Since the MR method is based on heuristics lacking a generative model, it is not possible to evaluate its accuracy in a setting consistent with its assumptions. Accordingly, the MR method can overestimate or underestimate ϕ, and the degree of bias depends on the firing-rate variability and the ground-truth ϕ. In contrast, our DSR method accurately estimates ϕ, and its accuracy is independent of the firing-rate variability and the ground-truth ϕ (Fig. 2c). Thus, the DSR method can reliably estimate ϕ for point processes across a wide range of spiking irregularity and firing rate variability.

Fig. 2: Estimation of spiking irregularity on synthetic data with known ground truth ϕ.

a We generated synthetic data from an ensemble of doubly stochastic renewal point processes {g(·), λ(t)}, where g(·) is the gamma distribution and the ground-truth value of ϕ ranges from 0.1 to 1. The firing rate λ(t) is either constant within a trial, sampled across trials from a uniform distribution with the width w (upper row), or a drift-diffusion process with sticky boundaries and the diffusion coefficient D (lower row). The parameters w and D control the trial-to-trial variability of the firing rate. We varied w from 10 to 30 Hz, and D from 5 to 13 Hz²/ms. b The ground-truth ϕ (x-axis) versus estimated ϕ (y-axis) for the ensemble of doubly stochastic renewal point processes with uniform (upper row) and drift-diffusion (lower row) firing rate fluctuations. The deterministic time rescaling (DTR) method always overestimates ϕ with error increasing for larger trial-to-trial variability of the firing rate (left). The minimum ratio (MR) method can underestimate or overestimate ϕ depending on the mean and variance of the firing rate, bin size, and the ground truth ϕ itself, producing unpredictable estimation errors (center). The doubly stochastic renewal (DSR) method accurately estimates ϕ in all cases (right). Every point in the scatter plots is the average over 20 simulations for a fixed ϕ and fixed D or w. Each simulation had 100 trials. c Estimation error (root-mean-square error, RMSE, across 20 simulations) for ϕ estimated by the three methods for different values of w (upper row) and D (lower row), which control the trial-to-trial variability of the firing rate. The RMSE increases with the firing rate variability for both DTR and MR methods, whereas the RMSE is consistently low and independent of the firing rate variability for our DSR method. Source data are provided as a Source data file.

Validation with intracellular voltage recordings

After confirming the accuracy of our partitioning method on synthetic data, we sought to validate our theoretical framework on neural recording data. For such validation, extracellular spike recordings are unsuitable because they do not provide an objective, independent measure of instantaneous firing rate on single trials. Instead, we use whole-cell recordings of intracellular membrane potential to estimate the instantaneous firing rate from the subthreshold voltage traces. We base our analysis on theoretical studies showing that, for a variety of leaky integrate-and-fire neuron models, the average firing rate is a power law function of the membrane potential50,51, which was confirmed experimentally in many cases52,53. Accordingly, we model the firing rate of a neuron as a deterministic function of the average subthreshold membrane potential. Using this function, we can obtain the instantaneous firing rate from the subthreshold voltage. With this independently estimated instantaneous firing rate, we can map spikes from real to operational time via time rescaling (Fig. 1a). Since ϕ in our framework is defined as \({{\mbox{CV}}}^{2}={\sigma }_{g}^{2}/{\mu }_{g}^{2}\) of the ISI distribution g(·) in operational time, CV² of the rescaled ISIs provides an independent estimate of ϕ which we can compare to ϕ estimated with the DSR method from spike times alone. A good agreement between these different estimates of ϕ would indicate that our DSR framework faithfully captures the statistics of biophysical processes relating subthreshold voltage dynamics to spikes.

We analyzed a dataset of whole-cell intracellular recordings of the membrane potential in parvalbumin-positive (PV) inhibitory neurons from layer 2/3 of the barrel cortex in awake head-fixed mice32 (Methods). We analyzed eight neurons from five mice (Fig. 3a). For each neuron, we first computed the empirical relationship between the average subthreshold voltage and firing rate estimated from spike counts in 50 ms time bins (Fig. 3b, Methods). Consistent with previous studies52,53, this relationship showed a lawful monotonically increasing trend on average, which we approximated with a smooth deterministic function f(v) by fitting a spline to the data points (Fig. 3b). We assumed that the function f(v) defines the instantaneous firing rate from the subthreshold voltage on single trials and that variability of spike counts for a fixed voltage can be captured with a stochastic spike generation mechanism in our DSR framework. Thus, we applied f(v) to the subthreshold voltage to obtain the instantaneous firing rate, which we then used to map ISIs to the operational time to estimate ϕ.
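The final rescaling step can be sketched as follows, assuming the instantaneous firing rate f(v) has already been evaluated on a regular time grid. The spline fit of f(v) is omitted, and the function name `cv2_after_rescaling` and its return convention are our own choices for illustration.

```python
import numpy as np

def cv2_after_rescaling(spike_times, rate, dt):
    """Map spikes to operational time with a single-trial rate and summarize ISIs.

    spike_times : 1-D array of spike times in s
    rate        : 1-D array, instantaneous firing rate in Hz on a grid with step dt
    Returns (mean rescaled ISI, CV^2 of rescaled ISIs, CV^2 of raw ISIs).
    """
    t_edges = np.arange(len(rate) + 1) * dt
    Lambda = np.concatenate(([0.0], np.cumsum(rate) * dt))  # Eq. (1)

    spike_times = np.sort(np.asarray(spike_times, dtype=float))
    isis_raw = np.diff(spike_times)
    isis_op = np.diff(np.interp(spike_times, t_edges, Lambda))

    cv2_op = isis_op.var() / isis_op.mean() ** 2
    cv2_raw = isis_raw.var() / isis_raw.mean() ** 2
    return isis_op.mean(), cv2_op, cv2_raw
```

If the rate estimate is a good proxy for the instantaneous firing rate, the mean rescaled ISI should be close to 1 and the CV² of the rescaled ISIs should be smaller than the raw CV², since the contribution of firing rate changes has been removed.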

Fig. 3: Validation of the DSR framework using intracellular voltage recordings.

a Example recording of the membrane potential in a PV neuron on a single trial32. The trace contains spikes and subthreshold voltage fluctuations. b A relationship between the membrane potential and instantaneous firing rate for the example neuron in (a). For each of the 25 ms bins in the recording, we estimate the instantaneous firing rate by the ratio of the spike count to the bin size and plot it against the subthreshold membrane potential averaged over the bin after removing spikes (blue points). Averaging the data points within 1 mV voltage bins reveals a lawful relationship between the subthreshold membrane potential and the firing rate (black points, error bars are s.e.m. over the data points in each voltage bin), which can be approximated with a spline fit (black line). c To independently estimate the spiking irregularity of a neuron, we map its spikes to the operational time using the instantaneous firing rate computed from the subthreshold membrane potential at each time using the fitted voltage-to-firing-rate relationship in (b). The spiking irregularity is CV² of ISIs in the operational time (blue) and accounts for a small fraction of the total variability \({\,{\mbox{CV}}\,}_{{{\rm{raw}}}}^{2}\) of the ISIs in the real time (yellow), which also contains the firing rate variability. For all 8 neurons in the dataset, ϕ estimated with the DSR method from spike times alone (orange) closely corresponds with the spiking irregularity estimated independently from the subthreshold voltage, validating the assumptions of the DSR framework. The dashed line indicates the Poisson irregularity value of 1. d The fraction of the total spiking variability \({\,{\mbox{CV}}\,}_{{{\rm{raw}}}}^{2}\) attributed to the spiking irregularity was similar between the DSR method (y-axis) and the partitioning based on the subthreshold voltage (x-axis). Each data point represents one neuron. For each neuron, the recording was divided into 20 segments of 10 s duration each. Dots indicate the mean of the estimates across these segments, and error bars show the standard error of the mean (SEM). Source data are provided as a Source data file.

We observed that the spiking irregularity estimated from the subthreshold voltage was consistent with our theoretical DSR framework. First, the mean \({\hat{\mu }}_{g}\) of the rescaled ISIs was close to 1 (\(| {\hat{\mu }}_{g}-1| < 0.031\) for all neurons), validating the condition μg ≈ 1 s in the DSR framework. This agreement indicates that the subthreshold voltage transformed via f(v) is a good proxy for the instantaneous firing rate in the DSR framework. Second, ϕ estimated from the subthreshold voltage corresponded well with ϕ estimated from spikes alone by the DSR method (Fig. 3c). The two estimates of ϕ differed by only 23% on average. Moreover, our DSR method and the partitioning based on the subthreshold voltage attributed a similar fraction of the total spiking variability (measured by \({\,{\mbox{CV}}\,}_{{{\rm{raw}}}}^{2}\) of ISIs in real time) to the spiking irregularity (Fig. 3d, \(\phi /{\,{\mbox{CV}}\,}_{{{\rm{raw}}}}^{2}\): 0.26 ± 0.02 for the DSR method, 0.27 ± 0.04 for the subthreshold voltage method, mean ± std across neurons). These results validate the theoretical assumptions of the DSR framework and confirm the accuracy of the DSR estimation method.

The diversity of spiking irregularity across neurons and cortical areas

Equipped with a reliable method for partitioning spiking variability, we asked how spiking irregularity ϕ varies across neurons and cortical areas. We analyzed spikes recorded from areas spanning different stages of the cortical hierarchy: from sensory (visual area V4, 282 neurons54), to association (lateral intraparietal area, LIP, 61 neurons55), and premotor regions (dorsal premotor cortex, PMd, 343 neurons56), in monkeys performing behavioral tasks (Methods). The spiking irregularity varied systematically across these areas (Fig. 4a). On average, the spiking irregularity was slightly super-Poisson in V4 (ϕ = 1.22 ± 0.03, mean ± std across neurons) and became sub-Poisson and more regular in LIP (ϕ = 0.68 ± 0.04) and PMd (ϕ = 0.51 ± 0.02). In addition, the diversity of spiking irregularity across neurons within each area also systematically decreased from V4 to LIP to PMd (Fig. 4a, standard deviation of ϕ across neurons: 0.45 in V4, 0.34 in LIP, 0.21 in PMd). While nearly all PMd neurons had sub-Poisson spiking irregularity ϕ < 1, spiking of different V4 neurons ranged from clock-like regular (ϕ ≈ 0) to highly irregular (ϕ > 2). Since our V4 and PMd recordings similarly sampled all cortical layers, these differences likely reflect differences in the circuitry and functional specialization of neurons in these areas. These results reject the assumption that the spike generation process is Poisson-like for most cortical neurons57,58, which is a common assumption in statistical models of neural dynamics on single trials11,19,21,59,60. Our results agree with the observation that spiking variability (FF) of many parietal10,26,27 and PMd neurons27,28 is sub-Poisson, whereas it is super-Poisson in visual cortical areas11,26,27,34. The advance of our results over previous observations is that the spiking irregularity ϕ reflects solely the neuron’s renewal function and is unaffected by the firing rate fluctuations.

Fig. 4: The diversity of spiking irregularity across neurons and cortical areas.

a Distributions of spiking irregularity ϕ across neurons in cortical areas V4 (red), LIP (green), and PMd (blue). Triangles mark the mean ϕ across neurons in each area. The probability densities are computed using a Gaussian kernel density estimator (with kernel widths of 0.15, 0.15, and 0.08 for V4, LIP, and PMd, respectively). Shading indicates the standard deviation across 100 bootstrap samples obtained by resampling neurons. b Spiking irregularity ϕ (x-axis) versus mean firing rate (y-axis) for each neuron (dots) in V4 (left), LIP (center), and PMd (right). The lines indicate 0.75 quantiles of ϕ (red vertical line) and firing rate (black horizontal line). In all areas, among the 25% of neurons with the highest firing rates (above the black line), only a few neurons had ϕ within the top 25% (to the right of the red line). Most PMd neurons with low-to-moderate firing rates (below the black line) had low ϕ (to the left of the red line), likely due to prominent beta-band synchronization in this area. Source data are provided as a Source data file.

The diversity of ϕ across neurons in each area raises a question about the relationship between ϕ and the mean firing rate of a neuron. In all areas, neurons with high firing rates had low spiking irregularity: among the 25% of neurons with the highest firing rates, only a few neurons had ϕ within the top 25% (Fig. 4b). Low spiking irregularity may allow these neurons to transmit signals with reduced noise to other brain regions26. In particular, neurons with high FF and low ϕ may transmit high-fidelity information about dynamically changing firing rate on single trials. Overall, the mean firing rate and ϕ were negatively correlated in V4 (Pearson correlation coefficient ρ = −0.27, p = 4 × 10⁻⁶, n = 282) and LIP (ρ = −0.34, p = 7 × 10⁻³, n = 61), and not significantly correlated in PMd (ρ = 0.03, p = 0.62, n = 343). This correlation did not arise solely from the refractory period effects of high-firing-rate neurons (Supplementary Note 1.5). The lack of correlation in PMd resulted from the prevalence of neurons with low-to-moderate firing rates and low ϕ, likely due to prominent beta-band synchronization in this area61,62. The observed inverse relationship between the mean firing rate and spiking irregularity is nontrivial because ϕ in our DSR framework is independent of the firing rate. These findings suggest that spiking irregularity may serve varying functions in different cortical areas, challenging the idea that it stems solely from inherently irreducible sources of noise63.

Partial invariance of spiking irregularity

Our finding that ϕ ≠ 1 for most cortical neurons suggests incorporating non-Poisson spiking irregularity into methods for estimating firing rates on single trials. An important consideration for developing such methods is whether the spiking irregularity of a neuron changes dynamically or is approximately constant, invariant to changes in behavioral and cognitive state. The total spiking variability, usually measured by the FF, changes dynamically in many conditions (e.g., during stimulus onset33 or selective attention58), but these changes may reflect modulations of either the firing rate, spiking irregularity48, or both. Therefore, we proceeded to examine whether ϕ of each neuron was invariant or changed across conditions and epochs of behavioral tasks, which correspond to different modes of computation associated with distinct operating regimes of network dynamics.

First, we compared ϕ of each V4 neuron across behavioral conditions in the spatial attention task performed by monkeys15. In this task, monkeys detected changes in a visual stimulus in the presence of three distractor stimuli and reported the change with an antisaccade response (Fig. 5a). In each trial, a cue indicated the stimulus that was most likely to change, which was thus the target of covert attention, and the stimulus opposite to the cue was the target of overt attention due to the antisaccade preparation. It is well established that variability of neural responses measured by the FF decreases during attention58,64, but whether the FF decrease results from a reduction in the firing-rate fluctuations, spiking irregularity, or both has not been tested. We compared ϕ estimated on trials when the monkeys directed their attention (either covert or overt) to the location of the neuron’s receptive field (RF) and on trials when they attended to locations outside the RF and found no significant differences in estimated ϕ across these attention conditions (Fig. 5b, p = 0.10, n = 237, two-sided Friedman test, Methods). This result indicates that spiking irregularity ϕ is invariant to the attentional modulation of the network state in V4, and that reduction in FF primarily reflects suppression of the firing-rate fluctuations. Thus, attention stabilizes firing rates over longer timescales without affecting the spiking irregularity of single neurons. This result suggests that attention enhances information transmitted through the firing rates rather than precise spike patterns and provides tight constraints for biophysical network models of attention.

Fig. 5: Dependence of spiking irregularity on the behavioral and cognitive state.

a Attention task performed by monkeys during V4 recordings. Monkeys detected an orientation change in one of four peripheral grating stimuli, while an attention cue (short white line) indicated which stimulus was likely to change. Monkeys reported the change with a saccade to the stimulus opposite to the change (black arrow). The cued stimulus was the target of covert attention, while the stimulus opposite to the cue was the target of overt attention. The dashed circle indicates the receptive field location of the recorded neurons (V4 RF). b Task conditions (left). On cue-RF trials (green frame), the attended stimulus (yellow circle) was in the RF of recorded neurons. On cue-opposite trials (orange frame), antisaccades were directed to the RF stimulus (black arrow). On cue-orthogonal trials (pink and purple frames), neither the attended stimulus nor the saccade target was in the RF. The scatter plots show estimated ϕ for every pair of task conditions. Each dot represents one V4 neuron. c Decision-making task performed by monkeys during PMd recordings. Monkeys discriminated the dominant color in a checkerboard stimulus composed of red and green squares and reported their choice by touching the corresponding target (upper panels). Task conditions varied by the response side indicated by the stimulus (left versus right) and stimulus difficulty controlled by seven coherence levels c1 through c7 (lower panels). d The scatter plots show estimated ϕ for each pair of coherence levels, combined across chosen sides. Each dot is one PMd neuron. e The scatter plot shows estimated ϕ for right versus left choice trials, combined across coherence levels. Each dot is one PMd neuron. f Histogram across PMd neurons of the modulation index of ϕ estimated during the fixation (when targets were visible) and decision epochs of the task. The triangle marks the median modulation index, which is significantly greater than zero (***p = 3 × 10⁻⁷, n = 262, one-sided Wilcoxon signed-rank test). Source data are provided as a Source data file.

Next, we compared ϕ of each PMd neuron across behavioral conditions in the decision-making task performed by monkeys56. In this task, monkeys discriminated the dominant color in a static checkerboard stimulus composed of red and green squares and reported their choice by touching the corresponding left or right target (Fig. 5c). The proportion of the same-color squares in the checkerboard (coherence) varied across trials to control the stimulus difficulty, with seven coherence levels used for each left and right response side, resulting in 14 stimulus conditions total. We estimated ϕ in each of these conditions separately during the decision epoch of the task after the checkerboard onset and found no significant differences in ϕ across stimulus conditions (Fig. 5d, p = 0.11, n = 272, two-sided Friedman test, Methods). Thus, while PMd dynamics change substantially across conditions in correlation with the chosen side and reaction time9,56, the spiking irregularity ϕ was invariant to these changes. Similarly, spiking irregularity ϕ of LIP neurons during the decision epoch was invariant between two-choice and four-choice decision-making tasks (p = 0.84, n = 60, two-sided Wilcoxon signed-rank test). Thus, in different tasks and cortical areas, we found that the spiking irregularity was invariant across behavioral conditions during the same task epoch.

Finally, we tested whether ϕ of a neuron was the same between different task epochs that engage distinct computations. For PMd neurons, we compared ϕ during the decision epoch while monkeys were making their choice and during the pre-stimulus fixation period while monkeys held their hand still on a fixation target waiting for the stimulus to appear (Fig. 5c, Methods). The spiking irregularity ϕ was significantly greater during the decision epoch than during pre-stimulus fixation (Fig. 5f, modulation index 2(ϕdecision − ϕfixation)/(ϕdecision + ϕfixation) = 0.16, p = 0.003, n = 262, two-sided Wilcoxon signed-rank test). The lower spiking irregularity ϕfixation likely reflects elevated beta-band synchronization during the fixation period, characteristic of the movement preparatory activity in PMd61,62. In contrast, the spiking irregularity of LIP neurons was not significantly different between the fixation and decision epochs of the task (p = 0.84, n = 45, two-sided Wilcoxon signed-rank test). Our results show that the spiking irregularity ϕ of a neuron is invariant to alterations of the network state in many conditions, such as different attention states or decisions, but ϕ can also change across different modes of network operation associated with distinct computations.
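The modulation index used in this comparison is a symmetric, normalized difference between two estimates of ϕ for the same neuron. A minimal sketch of that computation follows; the function name and the example ϕ values are hypothetical and are not taken from the recordings.

```python
import statistics


def modulation_index(phi_a, phi_b):
    """Symmetric modulation index 2(a - b) / (a + b) between two phi estimates."""
    total = phi_a + phi_b
    if total == 0:
        raise ValueError("modulation index is undefined when phi_a + phi_b == 0")
    return 2.0 * (phi_a - phi_b) / total


# Hypothetical per-neuron estimates (illustration only, not data from the paper).
phi_decision = [0.60, 0.48, 0.55, 0.70]
phi_fixation = [0.50, 0.45, 0.57, 0.52]

# One index per neuron, then a population summary (the figure reports the median).
indices = [modulation_index(d, f) for d, f in zip(phi_decision, phi_fixation)]
median_index = statistics.median(indices)
```

A positive index means ϕ is larger in the first condition. The index is bounded between −2 and 2 for non-negative ϕ, which keeps neurons with very different baseline irregularity on a common scale.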

Network mechanisms of spiking irregularity

We finally asked what biophysical mechanisms could explain our findings that cortical neurons have diverse spiking irregularity, which is inversely related to the mean firing rate and remains nearly constant across many conditions for individual neurons. To test possible mechanisms, we used a spiking recurrent neural network model that accounts for several characteristics of spiking variability in the visual cortex and its attentional modulation35 (Fig. 6a, Methods). The model consists of a three-layer hierarchy (thalamus-V1-MT), in which the V1 and MT layers are two-dimensional balanced networks of excitatory and inhibitory neurons with spatially ordered recurrent connectivity. The connection probability within recurrent layers falls off with distance between neurons, mimicking lateral connectivity in the visual cortex65. The V1 layer receives an excitatory feedforward input from the thalamus layer modeled as Poisson neurons firing at a uniform rate, while excitatory neurons in V1 project to the MT layer. Recurrent interactions in the network generate turbulent dynamics, which give rise to low-dimensional population-wide fluctuations in spiking activity35.

Fig. 6: Network mechanisms of diverse spiking irregularity and its invariance during attention.

a The model comprises three layers representing the thalamus, V1, and MT, implemented as spatially organized spiking networks (upper panel). Layer 1 consists of 2500 excitatory Poisson neurons firing at a uniform rate of 10 Hz. Layers 2 and 3 are recurrently coupled balanced networks, each containing 40,000 excitatory and 10,000 inhibitory neurons. Layer 2 receives feedforward excitatory input from Layer 1, and the excitatory neurons in Layer 2 project to Layer 3. Within Layers 2 and 3, neurons form spatially structured recurrent connections, with connection probability decaying with distance according to a Gaussian profile (lower panel; color intensity indicates connection probability). b Distributions of spiking irregularity (ϕ) for excitatory (blue) and inhibitory (red) neurons in Layer 2 (solid lines) and Layer 3 (dashed lines). The diversity of spiking irregularity across Layer 2 neurons aligns with experimental observations (cf. Fig. 4). Excitatory neurons in Layer 3 exhibit highly regular spiking, while inhibitory neurons display a broad range of spiking irregularity. c Spiking irregularity of neurons in Layer 2 decreases with the firing rate (blue dots: excitatory neurons; red dots: inhibitory neurons; lines: linear regression). d Spiking irregularity of neurons in Layer 2 decreases with the balance in the total number of excitatory and inhibitory connections received by a neuron, nfe + nee − nie. e Attentional modulation of FF, spiking irregularity ϕ, and firing rate variability Var(λT) in Layer 3. The FF modulation index (MI) is defined as (FFattended − FFunattended)/(FFattended + FFunattended), and similarly for spiking irregularity and firing rate variability. FF is significantly reduced during attention (left, ***p < 10⁻¹⁰, n = 35,944, one-sided t-test), driven by an attention-mediated reduction in firing rate variability (right, ***p < 10⁻¹⁰), while spiking irregularity remains unchanged (center, ns, p = 0.51). The dashed red line marks zero, and the blue triangle indicates the mean of the distribution. Source data are provided as a Source data file.

Since all neurons in the model have the same deterministic voltage threshold for spike generation, the spiking irregularity arises entirely from fluctuating inputs to a neuron shaped by the recurrent network dynamics. Similar to our cortical data, the spiking irregularity ϕ in the V1 layer varied broadly across neurons in the network model (Fig. 6b). The diversity of ϕ across neurons was not due to heterogeneous single-cell properties, since all excitatory and inhibitory neurons in the model had identical parameters, but due to differences in how neurons were embedded in the network. The connections between neurons in the model are made randomly with distance-dependent probability. Thus, by chance, some neurons receive more excitation than inhibition and reside near or even above the firing threshold and therefore spike regularly, while others receive stronger net inhibition, reside far from the firing threshold, and fire irregularly, driven by large fluctuations. Consistent with this mechanism, ϕ was negatively correlated with the average firing rate of a neuron (Fig. 6c) and with the difference between the number of excitatory and inhibitory connections it receives (Fig. 6d). The synaptic input balance is not the sole source of diverse spiking irregularity, as heterogeneous ϕ also arises from chaotic dynamics in a balanced random network model, where each neuron receives the same number of excitatory and inhibitory connections36 (Supplementary Note 1.6, Supplementary Fig. 5). Hence, the spiking irregularity ϕ arises from both the recurrent network dynamics and neuron’s embedding within this network. These results show that probabilistic spike generation in doubly stochastic point process models is compatible with a deterministic voltage threshold for spike firing in single neurons.

The distribution of ϕ in the spatially ordered balanced network model peaked below one, similar to that for LIP and PMd but not matching the greater diversity and larger values of ϕ in V4 (Fig. 6b). This observation suggests that other features of the V4 circuitry not included in this model may contribute to the spiking irregularity, such as a distinct dynamical regime (Supplementary Note 1.6, Supplementary Fig. 5), heterogeneous single-cell properties, cell types, and non-Poisson input. In addition, changes in the network state can dynamically modulate both the average firing rate and spiking irregularity via multiple biophysical mechanisms, such as inputs to excitatory or inhibitory neurons or changes in the membrane conductance (Supplementary Note 1.7, Supplementary Fig. 6), providing a possible mechanism for the change in ϕ observed in PMd (Fig. 5f). Thus, measurements of spiking irregularity in experimental data can provide tighter constraints on biophysical neural circuit models.

Finally, it has been shown previously that this spatial network model accounts for the reduction in FF during attention35, and we tested whether the FF reduction in the model resulted from changes in spiking irregularity or firing rate fluctuations. We model top-down attentional modulation as a static depolarizing input current to MT inhibitory neurons35 (0.2 mV/ms in the unattended state, 0.4 mV/ms in the attended state, Methods). While FF in the model was significantly reduced during attention (Fig. 6e, FF modulation index MIFF = −0.06, p < 10⁻¹⁰, n = 35,944, two-sided t-test), the spiking irregularity remained unchanged (Fig. 6e, ϕ modulation index MIϕ = −0.001, p = 0.51, n = 35,944, two-sided t-test), consistent with our experimental observations (Fig. 5b). Accordingly, the source of FF reduction in the model was a decrease in firing rate variability, which we estimated using Eq. (5) (Fig. 6e, Var(λT) modulation index MIVar(λT) = −0.12, p < 10⁻¹⁰, n = 35,944, two-sided t-test). Thus, a reduction in FF in the model resulted from a decrease in firing rate fluctuations while spiking irregularity remained unchanged, consistent with our observations in experimental data, therefore supporting the proposed circuit mechanism of attentional modulation. Our method is uniquely suited to detect this dissociation, providing tighter constraints on biophysical circuit models of attention.

Discussion

We introduced a DSR process, a mathematical framework for partitioning the total spiking variability of neurons into firing rate fluctuations and spiking irregularity. The standard model used to relate dynamically changing firing rates to spikes is an inhomogeneous Poisson process. However, the inhomogeneous Poisson process can only produce a fixed spiking irregularity, corresponding to ϕ = 1 in our DSR framework. Hence, it cannot account for diverse spiking statistics of neurons, e.g., regular spiking with FF smaller than one. On the other hand, a stationary renewal point process37 can generate spike trains with a fixed firing rate and any irregularity, from nearly periodic to super-Poisson. A previously proposed non-stationary extension of the renewal process incorporated a time-dependent firing rate but did not account for trial-to-trial firing rate fluctuations39,66,67. Our DSR process generalizes these previous models by encompassing both stochastic firing rate fluctuations and a broad spectrum of spiking irregularity. A subset of stationary renewal processes can be expressed as inhomogeneous Poisson processes68,69,70, leading to potential ambiguity in assigning variability to the firing rate versus spiking irregularity29. We resolve this ambiguity by imposing a minimal set of constraints: the renewal property in the operational time and smoothness of the firing rate over short timescales. With these constraints, our DSR framework enables unambiguous partitioning of variability.

We validated the accuracy of our estimation method on synthetic data with known ground truth. In contrast, we find that previous methods for partitioning spiking variability are less reliable either due to an inability to account for fluctuating firing rate39 or due to assumptions that do not always hold true10,28,31. This latter observation agrees with the previous work showing that any partitioning of variability is ambiguous without an underlying mathematical model29.

Furthermore, we confirmed that our DSR model aligns with the biophysical properties of neural circuits. This connection had not been tested despite the widespread use of doubly stochastic point processes for modeling spiking activity. In fact, the reliable spiking of cortical neurons in response to time-varying inputs71 may seem to suggest that stochastic spike generation models are incompatible with circuit biophysics. We show that the spiking irregularity can arise from recurrent dynamics and reflect a neuron’s embedding within the network, even when individual neurons have a deterministic voltage threshold for spike generation. Our validation of the DSR model using intracellular voltage recordings and spiking network simulations justifies the widespread use of doubly stochastic models in single-trial spike-train analysis and establishes their connection with the underlying biophysical processes.

We applied our method to survey the spiking irregularity of neurons across sensory, association, and premotor cortical areas. We found that neurons within each area showed a wide range of spiking irregularity, with the greatest diversity in area V4. The diversity of spiking irregularity may arise from several possible sources: heterogeneity in neurons’ morphology, cell type, or differences in how a neuron is embedded in the surrounding network. Our simulations of the spiking neural network models show that diverse spiking irregularity can arise from variations in the balance of excitatory and inhibitory inputs across neurons as well as from recurrent network dynamics. We further found that the average spiking irregularity decreased systematically from V4 to LIP to PMd, consistent with previous observations that responses of parietal and PMd neurons are more regular than Poisson26,27. These previous studies used various metrics to quantify spiking irregularity in data, but lacked a generative model. Therefore, it is difficult to assess the accuracy of these methods, leaving uncertainty about the reliability of derived conclusions. Our work overcomes these limitations by introducing a mathematical definition of spiking irregularity as a parameter within a generative model, which enables us to verify the accuracy of our estimation method and opens the possibility of integrating spiking irregularity into models of single-trial neural dynamics beyond the standard Poisson assumption. Our results confirm that spiking irregularity decreases along the cortical hierarchy, suggesting it may be related to the functional specialization of cortical areas.

While our results and previous studies26,27 show that spiking irregularity decreases from visual to association to motor cortical areas, intrinsic neural timescales systematically increase along the cortical hierarchy72,73,74,75. Intrinsic timescales are defined by the exponential decay rate of the autocorrelation function of spiking activity and typically range from tens to several hundred milliseconds, reflecting primarily slow firing-rate dynamics rather than spiking irregularity. Thus, firing rate timescales and spiking irregularity follow inverse gradients that align with the functional specialization of cortical areas. A precise Bayesian estimation method76 revealed that spiking activity in the primate visual cortex unfolds on at least two timescales: a fast ~5 ms timescale and a slow ~100 ms timescale77. The millisecond range of the fast timescale may partly reflect spiking irregularity. Moreover, the slow, but not the fast, timescale increased during selective attention77, consistent with our observation that spiking irregularity remains invariant while firing rate fluctuations are stabilized during attention. Together, these findings suggest that spiking irregularity and intrinsic timescales reflect distinct yet complementary features of neural dynamics, each influencing how cortical areas process information over time.

We tested whether spiking irregularity is a neuron-specific constant, invariant to changes in network dynamics due to variations in behavioral and cognitive state. Indeed, the spiking irregularity was invariant in many cases, such as across different attention states or decision difficulties. However, we also found that the spiking irregularity of PMd neurons changed between different epochs of the task48, which indicates that spiking irregularity can also change as a function of the network state. Our spiking network simulations show that depolarization of excitatory neurons combined with an increase in their membrane conductance can simultaneously increase the spiking irregularity and firing rate, similar to the modulation we observed in PMd. The effective membrane conductance increases with depolarization in conductance-based models of spiking neurons78,79, which are therefore especially useful for studying mechanisms modulating spiking irregularity.

Changes in neural variability can furnish insights into mechanisms of diverse brain functions10,11,80. For example, FF decreases in many brain regions during sensory stimulus onset33, motor preparation3,81, and selective attention58,82, which has been interpreted as a mechanism for enhancing the fidelity of neural representations. However, FF mixes contributions from both firing rate fluctuations and spiking irregularity, making such a mechanistic interpretation more challenging48. Going beyond FF, some studies partitioned neural variability into firing rate fluctuations and spiking irregularity10,31,48. This partitioning approach revealed that firing rate variability on longer timescales increases through the decision period, while assuming that spiking irregularity is fixed across different task epochs10. In contrast, we found that although spiking irregularity was invariant in many cases, it changed across different epochs of the decision-making task in PMd. Thus, our DSR framework and estimation method enable more nuanced analyses of neural variability, opening a possibility to identify how changes in the firing rate variability and spiking irregularity independently contribute to neural computations.

While our DSR model assumes that ISIs are independent in operational time, it can generate serial ISI correlations in real time through temporally correlated instantaneous firing rates. Such serial correlations between ISIs are commonly observed in data83,84 and can arise from either correlated input or firing rate adaptation in mechanistic integrate-and-fire models85,86,87,88. Our DSR model can incorporate ISI correlations in real time via temporally correlated firing rate fluctuations, while maintaining approximately uncorrelated ISIs in operational time (Supplementary Note 1.8). Thus, our DSR framework can be broadly applied to model spiking responses.

While an inhomogeneous Poisson process is widely used in methods for inferring latent neural dynamics on single trials18,19,20,21,22,23,24,25, our results highlight the need for incorporating non-Poisson spiking irregularity into these methods. One approach for incorporating non-Poisson spiking is to augment the inhomogeneous Poisson process with the instantaneous firing rate that depends on the spike history60,89,90. Our DSR framework offers two additional approaches for including non-Poisson spiking irregularity into latent variable models. First, we can estimate ϕ from data with our DSR method and then model the spike generation from the firing rate as a renewal point process with g(·) being a gamma distribution uniquely defined by the estimated ϕ. Second, we can simultaneously infer the distribution g(·) and latent dynamics91,92,93. In our framework, the inference of g(·) amounts to the estimation of a single parameter ϕ per neuron and thus is maximally parameter-efficient. Finally, our DSR framework provides an accurate metric for evaluating the goodness of fit of latent variable models, whereas Poisson likelihood may produce misleading results when applied to spikes with non-Poisson statistics30.

Together, our results uncover the great diversity in the spiking irregularity across neurons, cortical areas, and cognitive states, which cannot be captured with the conventional inhomogeneous Poisson model. Our theoretical framework and estimation method provide a flexible tool for quantifying spiking variability to investigate its role in neural computation and the underlying biophysical mechanisms. Our DSR point process provides a flexible model to capture the broad spectrum of spiking irregularity of cortical neurons and improve the precision of methods for estimating latent dynamics on single trials.

Methods

Partitioning variability

For a pair of random variables X and Y, the law of total variance (LOTV) decomposes the variance of Y into two parts: Var(Y) = E[Var(Y | X)] + Var(E[Y | X]). We assume that λ(t) is approximately constant within a time bin of size T; then, choosing X = λ and Y = NT, we obtain Eq. (3): Var(NT) = E[Var(NT | λ)] + Var(E[NT | λ]). Next, we calculate E[NT | λ] and Var(NT | λ) for a DSR point process {g(·), λ(t)}.
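The decomposition can be checked exactly in a simple special case. The sketch below assumes, purely for illustration, that the firing rate takes a few discrete values and that counts are Poisson given the rate (the ϕ = 1 case), so that both E[NT | λ] and Var(NT | λ) equal λT. The function name and the example rates are hypothetical.

```python
def mixed_poisson_variance(rates, probs, bin_size):
    """Compare the LOTV decomposition of Var(N_T) with a direct moment calculation.

    Assumes N_T | lambda is Poisson with mean lambda * bin_size, and that lambda
    takes the values in `rates` with probabilities `probs`.
    """
    mean_rate = sum(p * r for r, p in zip(rates, probs))
    var_rate = sum(p * (r - mean_rate) ** 2 for r, p in zip(rates, probs))

    # E[Var(N_T | lambda)]: the Poisson variance equals the mean lambda * T.
    expected_conditional_var = mean_rate * bin_size
    # Var(E[N_T | lambda]) = Var(lambda * T) = T^2 Var(lambda).
    var_conditional_mean = bin_size ** 2 * var_rate
    lotv_total = expected_conditional_var + var_conditional_mean

    # Direct route: E[N^2 | lambda] = lambda*T + (lambda*T)^2 for a Poisson count.
    second_moment = sum(
        p * (r * bin_size + (r * bin_size) ** 2) for r, p in zip(rates, probs)
    )
    direct_total = second_moment - (mean_rate * bin_size) ** 2
    return lotv_total, direct_total


lotv_total, direct_total = mixed_poisson_variance([5.0, 15.0], [0.5, 0.5], 1.0)
```

The two routes agree, with the first term reflecting spike-generation noise and the second reflecting firing rate fluctuations. The DSR derivation replaces the Poisson conditional variance with the renewal-process expression.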

First, we consider a simple renewal point process fully defined by its ISI probability density f(x), meaning that after generating a spike, the probability of the next spike occurring within the interval [x, x + dx] is f(x)dx. Denoting the mean, variance, and third central moment of f(x) by μ, σ², and μ3, we show that the mean and variance of the spike count NT in a bin of size T are (Theorem 1 in Supplementary Note 1.1):

$$E({{{\boldsymbol{N}}}}_{T})=\frac{T}{\mu },$$
(9)
$$\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T})=\frac{{\sigma }^{2}}{{\mu }^{3}}T+\frac{{\sigma }^{4}}{2{\mu }^{4}}+\frac{1}{6}-\frac{{\mu }_{3}}{3{\mu }^{3}}+{{\mathcal{O}}}({T}^{-1}).$$
(10)

Next, we consider a DSR point process {g(·), λ(t)}. Assuming λ(t) changes on a timescale longer than T, we can consider λ to be approximately constant within a bin. Then, within a single bin, the spike-generating process is a renewal point process fully defined by its ISI probability density, which we denote by fλ since it depends on the value of λ in the bin. We derive that E[NT | λ] = λT (Theorem 2 in Supplementary Note 1.1), which we substitute in Eq. (3) to get Var(NT) = E[Var(NT | λ)] + Var(λT). We further derive that for a moderately large bin size T > 1/E[λ], we can express E[Var(NT | λ)] via the first three moments of the probability density g(·) as stated in Eq. (4) (Theorem 4 in Supplementary Note 1.1). With these results, we obtain

$$\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T})=\,{{\mbox{Var}}}\,({{\boldsymbol{\lambda }}}T)+{\left(\frac{{\sigma }_{g}}{{\mu }_{g}}\right)}^{2}\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}]+\frac{1}{6}+\frac{1}{2}{\left(\frac{{\sigma }_{g}}{{\mu }_{g}}\right)}^{4}-\frac{1}{3}\frac{{{\mu }_{3}}_{g}}{{\mu }_{g}^{3}}+{{\mathcal{O}}}({T}^{-1}).$$
(11)

To simplify the partitioning equation, we assume that g(·) belongs to a two-parameter family of continuous probability distributions and is uniquely determined by its first two moments μg and \({{\sigma }_{g}}^{2}\). Since μg = 1 s, the probability density g(·) is uniquely determined by \(\phi={\sigma }_{g}^{2}/{\mu }_{g}^{2}\). Thus, the third central moment is a function of ϕ, which we denote by \({{\mu }_{3}}_{g}=\psi (\phi )\). With this parametrization, we obtain our general partitioning equation:

$$\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T})=\,{{\mbox{Var}}}\,({{\boldsymbol{\lambda }}}T)+\phi \,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}]+\frac{1}{6}+\frac{1}{2}{\phi }^{2}-\frac{1}{3}\psi (\phi )+{{\mathcal{O}}}({T}^{-1}).$$
(12)

As a special case of the two-parameter family, we consider g(·) to be a gamma distribution38,42,43

$$g(\tau )=\frac{1}{\Gamma (k){\theta }^{k}}{\tau }^{k-1}{e}^{-\frac{\tau }{\theta }},$$
(13)

which we can reparameterize in terms of ϕ:

$$g(\tau )={(\Gamma ({\phi }^{-1}))}^{-1}{\phi }^{-\frac{1}{\phi }}{\tau }^{\frac{1-\phi }{\phi }}{e}^{-\frac{\tau }{\phi }}.$$
(14)
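In the parametrization of Eq. (14), the gamma shape is k = 1/ϕ and the scale is θ = ϕ, which fixes the mean ISI at 1 in operational time. A minimal sampling sketch using Python's standard library follows. The function names are hypothetical, and the spike-train helper makes two simplifying assumptions: a firing rate that is constant within the trial, and a first interval drawn from g rather than from the equilibrium distribution.

```python
import random
import statistics


def sample_gamma_isis(phi, n, rng):
    """Draw n operational-time ISIs from a gamma density with mean 1 and variance phi."""
    shape = 1.0 / phi  # k = 1 / phi
    scale = phi        # theta = phi, so the mean k * theta equals 1
    return [rng.gammavariate(shape, scale) for _ in range(n)]


def spike_times_constant_rate(phi, rate, duration, rng):
    """Spike times in real time for a constant rate (simplifying assumption).

    With a constant rate, operational time is rate * t, so each operational-time
    ISI is divided by the rate to obtain a real-time interval.
    """
    times = []
    t = rng.gammavariate(1.0 / phi, phi) / rate
    while t < duration:
        times.append(t)
        t += rng.gammavariate(1.0 / phi, phi) / rate
    return times


rng = random.Random(0)
isis = sample_gamma_isis(0.5, 50000, rng)
isi_mean = statistics.fmean(isis)
isi_variance = statistics.pvariance(isis)
spikes = spike_times_constant_rate(0.5, 20.0, 2.0, rng)
```

With enough samples the empirical mean approaches 1 and the empirical variance approaches ϕ. Values of ϕ below 1 give more regular trains than Poisson, and values above 1 give burstier trains.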

In the special case of ϕ = 1, the ISI distribution reduces to the exponential distribution \(g(\tau )={e}^{-\tau }\), which corresponds to the Poisson spiking process. We can compute the central moments of the gamma distribution g(τ) in terms of ϕ, specifically, μg = kθ = 1, \({\sigma }_{g}^{2}=k{\theta }^{2}=\phi\), and \({{\mu }_{3}}_{g}=2k{\theta }^{3}=2{\phi }^{2}\). Substituting these expressions in Eq. (11), we obtain our final partitioning equation Eq. (5).
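As a worked check of this substitution, using only the gamma moments stated above (so that ψ(ϕ) = 2ϕ² for this family), the T-independent terms of Eq. (11) collapse to a single term:

$$\frac{1}{6}+\frac{1}{2}{\phi }^{2}-\frac{1}{3}\cdot 2{\phi }^{2}=\frac{1-{\phi }^{2}}{6}.$$

The partitioning equation for gamma-distributed operational-time ISIs then takes the following form, which is what Eq. (5) is expected to state:

$$\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T})=\,{{\mbox{Var}}}\,({{\boldsymbol{\lambda }}}T)+\phi \,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}]+\frac{1-{\phi }^{2}}{6}+{{\mathcal{O}}}({T}^{-1}).$$

For ϕ = 1 the constant vanishes and Var(NT) = Var(λT) + E[NT], the familiar result for a doubly stochastic Poisson process.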

Estimation methods for ϕ

We develop a DSR method for estimating ϕ from spike data. The data consist of spike times for multiple trials. We choose a bin size T and estimate E[NT] and Var(NT) as the mean and variance of spike counts across trials in bins [t, t + T], where t is a time within a trial. Besides ϕ, the only other unknown term in the partitioning equation Eq. (12) is Var(λT). Using the fact that Var(λT) = T²Var(λ) for every bin, we express this unknown term through the other terms

$${T}^{2}\,{{\mbox{Var}}}\,({{\boldsymbol{\lambda }}})\approx \,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T})-\phi \,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}]-\frac{1}{6}-\frac{1}{2}{\phi }^{2}+\frac{1}{3}\psi (\phi ).$$
(15)

In this equation, E[NT], Var(NT), and T²Var(λ) are all functions of the bin size T. Thus, by changing the bin size, we can obtain a system of two equations for two bins [t, t + T] and \([t,t+\widetilde{T}]\):

$$\left\{\begin{array}{l}{T}^{2}\,{{\mbox{Var}}}\,({{\boldsymbol{\lambda }}})\approx \,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T})-\phi \,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}]-\frac{1}{6}-\frac{1}{2}{\phi }^{2}+\frac{1}{3}\psi (\phi ),\quad \\ {\widetilde{T}}^{2}\,{{\mbox{Var}}}\,({{\boldsymbol{\lambda }}})\approx \,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{\widetilde{T}})-\phi \,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{\widetilde{T}}]-\frac{1}{6}-\frac{1}{2}{\phi }^{2}+\frac{1}{3}\psi (\phi ).\quad \end{array}\right.$$
(16)

Denoting \(\alpha=\widetilde{T}/T > 1\), we eliminate the unknown Var(λ) in this system of equations and obtain a single equation in which ϕ is the only unknown variable:

$$ ({\alpha }^{2}-1)\left(\frac{\psi ({\phi }_{{{\rm{DSR}}}})}{3}-\frac{{\phi }_{{{\rm{DSR}}}}^{2}}{2}\right)-({\alpha }^{2}\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}] -\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{\widetilde{T}}]){\phi }_{{{\rm{DSR}}}} \\ +{\alpha }^{2}\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T})-\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{\widetilde{T}})-\frac{{\alpha }^{2}-1}{6}=0.$$
(17)

Here we denote the solution of this equation by ϕDSR, which is an estimate of ϕ with the DSR method. The coefficients in this equation include four terms \(\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}],\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{\widetilde{T}}],\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T})\) and \(\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{\widetilde{T}})\) that can be directly estimated from data with a moderate number of trials. In the case when g(·) is the gamma distribution, \({{\mu }_{3}}_{g}=\psi (\phi )=2{\phi }^{2}\), and we obtain a quadratic equation for ϕDSR:

$$ \frac{{\alpha }^{2}-1}{6}{\phi }_{{{\rm{DSR}}}}^{2}-({\alpha }^{2}\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}]-\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{\widetilde{T}}]){\phi }_{{{\rm{DSR}}}}+{\alpha }^{2}\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{T})\\ -\,{{\mbox{Var}}}\,({{{\boldsymbol{N}}}}_{\widetilde{T}})-\frac{{\alpha }^{2}-1}{6}=0.$$
(18)

To derive this equation, we assumed that for every realization of λ(t), the firing rate λ(t) changes slowly relative to the timescales T and \(\widetilde{T}=\alpha T\). Thus, we assume that λ(t) is constant within the bin \(\widetilde{T}\), hence, α cannot be too large. However, if α is very close to 1, the two equations for different bin sizes T and \(\widetilde{T}\) are nearly identical, leading to a large error in estimated ϕ. Through simulations, we found that α = 2 produces accurate results for wide ranges of parameters, and therefore, we set α = 2 in all analyses. Substituting α = 2 in Eq. (18), we obtain the final estimation equation Eq. (6).
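A minimal sketch of solving the gamma-case quadratic, Eq. (18), is given below. The function name and argument names are illustrative, and the choice of the smaller root is our assumption here: it is the root that tends to the ratio of the constant and linear coefficients when the quadratic term is negligible, and it recovers ϕ = 1 for Poisson count statistics.

```python
import math

def solve_phi_dsr(mean_T, var_T, mean_aT, var_aT, alpha=2.0):
    """Solve the gamma-case quadratic, Eq. (18), for phi_DSR.

    mean_T, var_T   : spike-count mean and variance in the bin of size T
    mean_aT, var_aT : the same statistics in the bin of size alpha * T
    Returns NaN when the quadratic has no real root.
    """
    a = (alpha**2 - 1.0) / 6.0
    b = alpha**2 * mean_T - mean_aT
    c = alpha**2 * var_T - var_aT - (alpha**2 - 1.0) / 6.0
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return float("nan")
    # Assumption: take the smaller root, which tends to c / b when the
    # quadratic term is negligible.
    return (b - math.sqrt(disc)) / (2.0 * a)

# Poisson counts with a constant rate: variance equals mean in both bins.
phi_poisson = solve_phi_dsr(2.0, 2.0, 4.0, 4.0)
# Gamma counts with phi = 0.5: Var = phi * mean + (1 - phi^2) / 6.
phi_gamma = solve_phi_dsr(3.0, 1.625, 6.0, 3.125)
```

Any trial-to-trial rate variance adds v to Var(NT) and α²v to the variance in the larger bin, so it cancels in the constant coefficient and leaves the solution unchanged.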

We compared the accuracy of our estimation method with two previously proposed methods. The first method, the DTR39, assumes that the time-dependent firing rate is the same on every trial, deterministically locked to a trial event. The method then estimates the firing rate as the average spike count in a bin across trials: \(\hat{\lambda }({t}_{i})=1/K{\sum }_{k=1}^{K}{N}_{T,i}^{k}\), where \({N}_{T,i}^{k}\) is the number of spikes in the bin \([{t}_{i}-\frac{T}{2},{t}_{i}+\frac{T}{2}]\) in the kth trial, and K is the number of trials. If the ground-truth firing rate is the same on each trial, \({\phi }_{{{\rm{DTR}}}}\) converges to the ground-truth ϕ for large trial number: \({\lim }_{{{\rm{Var}}}[{{\boldsymbol{\lambda }}}(t)]\to 0,K\to \infty }{\phi }_{{{\rm{DTR}}}}=\phi\) (Supplementary Note 1.3). However, in the presence of trial-to-trial firing rate fluctuations, we derive that the estimation error of this method is \({\phi }_{{{\rm{DTR}}}}-\phi=(A-1)(\phi+1)\), where A is a monotonically increasing function of the average firing rate variance (Supplementary Note 1.3).

The second family of methods10,31,47 assumes that the point process variance is proportional to the mean spike count, E[Var(NT | λ)] = ϕE[NT], where ϕ is a neuron-specific constant (Eq. 7). This assumption was motivated by renewal theory37, which states that for a stationary renewal process with a constant firing rate, the FF converges to the squared coefficient of variation of ISIs (CV²) in the limit of an infinite bin size: \({\lim }_{T\to \infty }\,{{\mbox{FF}}}={{\mbox{CV}}}^{2}\). However, this relation strictly applies only in the limit of infinite bin size (T → ∞) and a constant firing rate94, and for a finite bin size, the FF of a renewal process depends on both the bin size and firing rate. Using Eq. (4), we derive that for a renewal process with a constant firing rate and a finite bin size, \(\,{{\mbox{FF}}}={{\mbox{CV}}}^{2}+\frac{{c}_{1}}{{{\mbox{E}}}\,({{{\boldsymbol{N}}}}_{T})}+\frac{1}{T}\cdot \frac{{c}_{2}}{\,{{\mbox{E}}}\,({{{\boldsymbol{N}}}}_{T})}+{{\mathcal{O}}}({T}^{-2})\), where the coefficients c1 and c2 depend on g(·). This relation shows that ϕ, defined via Eq. (7) with a finite bin size, is not a constant characterizing the renewal process, but a function of the bin size and firing rate. Therefore, any method for estimating ϕ using Eq. (7) with a finite bin size will produce results that do not uniquely characterize the spiking irregularity of a neuron, but depend on nuisance parameters such as the bin size and firing rate. While it is possible to accurately estimate ϕ (i.e., CV² of ISIs of a renewal process) by considering the asymptotic behavior of FF(T) for large bin sizes48, this asymptotic method requires the firing rate to be constant over time bins T longer than fast timescales involved in behavior, e.g., decision-making.
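To make the finite-bin dependence concrete, consider the gamma case with a constant firing rate. Setting Var(λT) = 0 and ψ(ϕ) = 2ϕ² in Eq. (12) and dividing by E[NT] gives, to leading order,

$$\,{{\mbox{FF}}}\,\approx \phi+\frac{1-{\phi }^{2}}{6\,{{\mbox{E}}}\,[{{{\boldsymbol{N}}}}_{T}]}.$$

For example, with ϕ = 0.5 and E[NT] = 2 (illustrative values, not taken from data), FF ≈ 0.5 + 0.75/12 ≈ 0.56, whereas the same process counted in bins with E[NT] = 20 gives FF ≈ 0.506. The same neuron therefore yields different FF values depending only on the bin size and firing rate.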

Based on Eq. (7), the MR method10 estimates ϕ as the minimum FF across all time bins. We show that for DSR processes, the estimation error of this method is (Supplementary Note 1.4)

$${\phi }_{{{\rm{MR}}}}-\phi \approx \mathop{\min }_{t}\left\{T\frac{\,{{\mbox{Var}}}\,({{\boldsymbol{\lambda }}}(t))}{\,{{\mbox{E}}}\,[{{\boldsymbol{\lambda }}}(t)]}+\frac{1}{T}\cdot \frac{\frac{1}{6}+\frac{1}{2}{(\phi )}^{2}-\frac{1}{3}\psi (\phi )}{\,{{\mbox{E}}}\,[{{\boldsymbol{\lambda }}}(t)]}\right\}.$$
(19)
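The right-hand side of Eq. (19) can be evaluated numerically to see how the MR error varies with bin size and firing rate. The sketch below assumes gamma ISIs, so that the constant in the second term equals (1 − ϕ²)/6; the function name and the example inputs are hypothetical.

```python
def mr_error(T, mean_rate, var_rate, phi):
    """Approximate phi_MR - phi from Eq. (19), assuming gamma ISIs.

    T         : bin size in seconds
    mean_rate : E[lambda(t)] in Hz, one value per time bin
    var_rate  : Var(lambda(t)) in Hz^2, one value per time bin
    """
    # 1/6 + phi^2/2 - psi(phi)/3 with psi(phi) = 2 * phi^2
    const = (1.0 - phi**2) / 6.0
    return min(T * v / m + const / (T * m) for m, v in zip(mean_rate, var_rate))

# Constant 30 Hz rate, 60 ms bins, phi = 0.5: only the finite-bin term remains.
err = mr_error(0.06, [30.0], [0.0], 0.5)
```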

Other methods were also proposed for estimating ϕ in Eq. (7) with better accuracy than the MR method under the assumption that the firing rate obeys an unbounded drift-diffusion process31,47. However, these methods require a priori knowledge of the firing rate dynamics. In addition, they start from the same premise, Eq. (7), as the MR method, and therefore have similar limitations for a finite bin size.

Synthetic data generation

For the drift-diffusion model with sticky boundaries (Figs. 1 and 2), we generate the firing rate from

$$\frac{d{{\boldsymbol{\lambda }}}(t)}{dt}=\left\{\begin{array}{ll}0,\quad \hfill &{{\boldsymbol{\lambda }}}(t)={b}_{l},\hfill \\ \nu+\sqrt{2D}{{\boldsymbol{\xi }}}(t),\quad &{b}_{l} < {{\boldsymbol{\lambda }}}(t) < {b}_{u},\hfill \\ 0,\quad \hfill &{{\boldsymbol{\lambda }}}(t)={b}_{u}.\hfill \end{array}\right.$$
(20)

Here, ν is the drift, ξ(t) is a white Gaussian noise with 〈ξ(t)〉 = 0 and \(\langle {{\boldsymbol{\xi }}}(t){{\boldsymbol{\xi }}}({t}^{{\prime} })\rangle=\delta (t-{t}^{{\prime} })\), D is the diffusion coefficient, and bl and bu are the lower and upper boundary values, respectively. We use bl = 1 Hz, bu = 20 Hz, ν = 0 Hz ms⁻¹, and the initial firing rate value λ = 10 Hz (Fig. 1) and bl = 1 Hz, bu = 60 Hz, ν = 0.0138 Hz ms⁻¹, and the initial firing rate value λ = 30 Hz (Fig. 2).
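One way to simulate Eq. (20) is a forward Euler-Maruyama scheme in which the rate stops changing once it reaches either boundary. The sketch below is illustrative only: the function name, the time step, and the value of D are assumptions (D is not stated in this section), and units are Hz for the rate and ms for time.

```python
import numpy as np

def simulate_sticky_ddm(lam0, nu, D, b_l, b_u, dt, n_steps, rng):
    """Euler-Maruyama simulation of drift-diffusion with sticky boundaries."""
    lam = np.empty(n_steps + 1)
    lam[0] = lam0
    for i in range(n_steps):
        x = lam[i]
        if x <= b_l or x >= b_u:
            lam[i + 1] = x  # on a boundary: d(lambda)/dt = 0
        else:
            x_new = x + nu * dt + np.sqrt(2.0 * D * dt) * rng.standard_normal()
            lam[i + 1] = min(max(x_new, b_l), b_u)  # clip overshoot onto the boundary
    return lam

rng = np.random.default_rng(0)
# Boundaries, drift, and initial value as stated for Fig. 1; D = 0.05 Hz^2/ms is a placeholder.
trace = simulate_sticky_ddm(lam0=10.0, nu=0.0, D=0.05, b_l=1.0, b_u=20.0,
                            dt=1.0, n_steps=1000, rng=rng)
```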

Estimation of ϕ from spike data

To estimate ϕ with the DSR method in synthetic and experimental data, we found ϕ as a solution of Eq. (6) for each time point ti within an analysis window with two time bins [ti, ti + T] and [ti, ti + 2T]. We then obtained the final ϕ by averaging the results across all time points ti within the analysis window. To ensure that Eq. (5) holds, we require T > 1/E[λ]. Since we assume the firing rate is constant within a bin, we also need to choose a bin size as small as possible. To satisfy both conditions, we set T = 2/E[λ] for each neuron in experimental and synthetic data, where E[λ] is the average firing rate of the neuron over the analysis period. We confirmed that our inference method is robust and largely insensitive to bin size across a wide range of values, and remains reliable even in the presence of occasional rapid changes in firing rate (Supplementary Note 1.2).
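The procedure above can be sketched as follows. This is a minimal illustration rather than the published implementation: it solves the gamma-case quadratic of Eq. (18) with α = 2 in place of Eq. (6), takes the smaller root, and spaces the time points ti by one bin width T; all three choices, and the function names, are assumptions.

```python
import numpy as np

def count_spikes(trials, start, stop):
    """Spike counts in [start, stop) for each trial (sorted spike-time arrays)."""
    return np.array([np.searchsorted(s, stop) - np.searchsorted(s, start)
                     for s in trials])

def estimate_phi_dsr(trials, window, step=None):
    """Average the per-time-point solutions of the alpha = 2 quadratic."""
    t0, t1 = window
    mean_rate = count_spikes(trials, t0, t1).sum() / (len(trials) * (t1 - t0))
    T = 2.0 / mean_rate                    # bin size T = 2 / E[lambda]
    step = T if step is None else step     # spacing of time points t_i (assumption)
    phis = []
    for ti in np.arange(t0, t1 - 2 * T + 1e-12, step):
        n1 = count_spikes(trials, ti, ti + T)
        n2 = count_spikes(trials, ti, ti + 2 * T)
        a = 0.5                            # (alpha^2 - 1) / 6 with alpha = 2
        b = 4 * n1.mean() - n2.mean()
        c = 4 * n1.var(ddof=1) - n2.var(ddof=1) - 0.5
        disc = b * b - 4 * a * c
        if disc >= 0:                      # skip time points without a real root
            phis.append((b - np.sqrt(disc)) / (2 * a))
    return float(np.mean(phis)) if phis else float("nan")

# Synthetic check: homogeneous Poisson spiking at 20 Hz, for which phi = 1.
rng = np.random.default_rng(0)
rate, duration = 20.0, 2.0
trials = []
for _ in range(2000):
    t = np.cumsum(rng.exponential(1.0 / rate, size=200))
    trials.append(t[t < duration])
phi_hat = estimate_phi_dsr(trials, (0.0, duration))
```

With a finite number of trials the estimate scatters around the ground truth, so the synthetic check should be read as approximate.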

To estimate ϕ with the DTR method (Fig. 2), we estimated the trial-averaged firing rate using a 60 ms sliding window with 10 ms increments. We used this firing rate and Eq. (1) to map spike times from real time t to operational time \({t}^{{\prime} }\). We then computed ϕ as the squared coefficient of variation CV² of ISIs in operational time. For the MR method (Fig. 2), we used Eq. (8) with a bin size of 60 ms.
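The time-rescaling step of the DTR method can be sketched as below. Eq. (1) is not reproduced in this section, so the code assumes the standard mapping in which operational time is the cumulative integral of the trial-averaged firing rate; the function name and the pooling of ISIs across trials are likewise assumptions.

```python
import numpy as np

def cv2_operational_time(trials, rate_t, rate):
    """CV^2 of ISIs after mapping spike times t to t' = integral of the rate up to t.

    trials : list of sorted spike-time arrays (s)
    rate_t : time grid of the trial-averaged rate estimate (s)
    rate   : trial-averaged firing rate on that grid (Hz)
    """
    # Cumulative trapezoidal integral of the rate on its time grid.
    cum = np.concatenate(([0.0],
                          np.cumsum(0.5 * (rate[1:] + rate[:-1]) * np.diff(rate_t))))
    isis = []
    for s in trials:
        t_op = np.interp(s, rate_t, cum)   # spike times in operational time
        isis.append(np.diff(t_op))
    isis = np.concatenate(isis)
    return isis.var() / isis.mean() ** 2

# Perfectly regular spiking under a constant rate has CV^2 = 0.
grid = np.linspace(0.0, 1.0, 101)
cv2 = cv2_operational_time([np.array([0.1, 0.2, 0.3, 0.4])], grid, np.full(101, 10.0))
```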

Intracellular voltage recordings

We used a previously described dataset32 of whole-cell intracellular recordings of membrane potential from PV neurons in L2/3 of barrel cortex in awake head-restrained mice (six 5–10 week old female and male PV-IRES-Cre mice). All experiments were carried out in accordance with protocols approved by the Swiss Federal Veterinary Office (authorization VD1628). The sampling rate of the membrane potential measurements was 20 kHz. We only analyzed neurons with a trial-averaged firing rate of at least 20 Hz that had data for at least 10 trials of 20 s duration each. These criteria yielded eight neurons from five female mice.

We estimated the function that relates subthreshold membrane potential to instantaneous firing rate, similar to previous studies52,53. First, we removed action potentials from the voltage traces. We defined the spike time as the time when the membrane potential crossed a −30 mV threshold (the average membrane potential of the cells is typically much lower, at about −60 mV). We then removed the segment of the voltage trace from 3 ms before to 5 ms after each spike time and linearly interpolated between these two points. Second, we segmented the recorded voltage trace into Δt = 50 ms time bins. For each time bin i, we computed the average membrane potential Vi and the number of spikes Ni within that time bin. We plotted Ni/Δt versus Vi for all time bins (blue dots in Fig. 3b). Next, we divided the voltage range [−68, −40] mV into ΔV = 1 mV bins ΔVk (k = 1, 2, …, 28). We found the set of all data points falling within the kth bin, Sk = {i : Vi ∈ ΔVk}, and computed the average firing rate

$${r}_{k}=\frac{1}{| {S}_{k}| \Delta t}\,{\sum}_{i\in {S}_{k}}{N}_{i}$$
(21)

and the average voltage \({V}_{k}=\frac{1}{2}\cdot (-68+k\Delta V)\) in that bin. Finally, we fit this average relationship with a spline to approximate the dependence of the instantaneous firing rate rk on subthreshold membrane potential Vk using the deterministic function f(v).
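The binning step of Eq. (21) can be sketched as follows. The function and variable names are illustrative, bins are treated as half-open intervals, and empty voltage bins are returned as NaN; these are our choices, and the final spline fit is omitted.

```python
import numpy as np

def binned_rate(v_mean, n_spikes, dt=0.05, v_min=-68.0, v_max=-40.0, dv=1.0):
    """Average firing rate r_k (Hz) in each voltage bin, following Eq. (21).

    v_mean   : average membrane potential V_i in each time bin (mV)
    n_spikes : spike count N_i in each time bin
    dt       : time-bin width in seconds (50 ms in the text)
    """
    edges = np.arange(v_min, v_max + dv / 2, dv)   # 29 edges, 28 voltage bins
    k = np.digitize(v_mean, edges) - 1             # bin index of each time bin
    rates = np.full(len(edges) - 1, np.nan)
    for j in range(len(edges) - 1):
        in_bin = k == j
        if in_bin.any():
            rates[j] = n_spikes[in_bin].sum() / (in_bin.sum() * dt)
    return edges, rates

# Toy input: two time bins near -67.5 mV and one near -50.2 mV.
edges, rates = binned_rate(np.array([-67.5, -67.5, -50.2]), np.array([1, 3, 2]))
```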

Neural recording data

We analyzed three experimental datasets described previously: recordings from area V4 during a spatial selective attention task54, recordings from the LIP during two-choice and four-choice decision-making tasks55, and recordings from the dorsal premotor cortex (PMd) during a decision-making task56. Experimental procedures for the V4 and PMd datasets were in accordance with the NIH Guide for the Care and Use of Laboratory Animals, the Society for Neuroscience Guidelines and Policies, and the Stanford Institutional Animal Care and Use Committee. Experimental procedures for the LIP dataset were in accordance with the NIH Guide for the Care and Use of Laboratory Animals and approved by the University of Washington Animal Care Committee.

V4 dataset

During recordings, the monkeys (G and B, Macaca mulatta, male, between 6 and 9 years old) detected orientation changes in one of the four peripheral grating stimuli while maintaining central fixation. Each trial started by fixating a central fixation dot on the screen, and after several hundred milliseconds (170 ms for monkey B and 333 ms for monkey G), four peripheral stimuli appeared. Following a 200–500 ms period, a central attention cue indicated the stimulus that was likely to change with  ~90% validity. The cue was a short line from a fixation dot pointing toward one of the four stimuli, randomly chosen on each trial with equal probability. After a variable interval (600–2200 ms), all four stimuli disappeared for a brief moment and reappeared. Monkeys were rewarded for correctly reporting the change in orientation of one of the stimuli (50% of trials) with an antisaccade to the location opposite to the change, or maintaining fixation if none of the orientations changed. Due to the anticipation of an antisaccade response, the cued stimulus was the target of covert attention, while the stimulus in a location opposite to the cue was the target of overt attention. In the cue-RF condition (Cue-RF), the cue pointed to the stimulus in the RFs of the recorded neurons (covert attention). In the cue-opposite condition (Cue-opp), the cue pointed to the stimulus opposite to the RFs (overt attention). The remaining two cue directions were cue-orthogonal conditions (Cue-orth-1 and Cue-orth-2), in which monkeys attended away from the RFs.

Recordings were performed in the visual area V4 with linear array microelectrodes inserted perpendicularly to the cortical layers. Data were amplified and recorded using the Omniplex system (Plexon). Arrays were placed such that the RFs of recorded neurons largely overlapped. Each array had 16 channels with 150 μm center-to-center spacing. After spike sorting and quality control, the dataset had 285 well-isolated single neurons from two monkeys.

LIP dataset

During recordings, two monkeys (Macaca mulatta, male, between 11 and 13 years old) performed the random dot motion discrimination task with either two or four choice alternatives. After a variable fixation period, two or four peripheral choice targets appeared to signal the direction alternatives on the trial. After a random delay (250–800 ms), dynamic random dot motion was displayed around the fixation point. The percentage of coherently moving dots on each trial controlled the task difficulty. Monkeys reported the net direction of motion in dynamic random dots by making a saccade to a peripheral choice target. The motion stimulus ended by the time of saccade initiation.

Single neuron activity was recorded with Alpha Omega electrodes introduced into the LIP area. Data were amplified and recorded using the Omniplex system (Plexon). Neurons were selected according to anatomical and physiological criteria, and all had spatially selective responses during the delay on the overlap and memory saccade tasks55. The dataset consists of extracellular recordings from 70 well-isolated neurons.

PMd dataset

During recordings, the monkeys (T and O, Macaca mulatta, male, between 6 and 9 years old) discriminated the dominant color in a static checkerboard stimulus composed of red and green squares and reported their choice by touching the corresponding target. At the start of each trial, a monkey touched a central target and fixated on a cross above the central target. After a short holding period (300–485 ms), red and green targets appeared on the left and right sides of the screen. The colors of each side were randomized on each trial. After another short delay (400–1000 ms), the checkerboard stimulus appeared on the screen at the fixation cross, and the monkey had to move its hand to the target matching the dominant color in the checkerboard. The difficulty of the task was parameterized by an unsigned stimulus coherence expressed as the absolute difference between the number of red (R) and green (G) squares, normalized by the total number of squares: |R − G|/(R + G). The checkerboard was 15 × 15 squares, which led to a total of 225 squares. The task was performed with 7 different unsigned coherence levels for monkey T and 8 levels for monkey O, and we analyzed the 7 overlapping coherence levels for two monkeys. Since PMd neurons are selective for the chosen side but not for color56, we divided the trials according to the side indicated by the stimulus (left or right) for each coherence level, resulting in 14 analyzed conditions in total.

Neural activity was recorded with a linear multi-contact electrode (U-probe) with 16 channels with 150 μm center-to-center spacing. After spike sorting and quality control, the dataset had 801 well-isolated single neurons from two monkeys.

Selection of units for the analyses

For all datasets, we selected units for our analyses based on two criteria: (i) we included conditions that had at least 20 trials, (ii) we included units that had at least 500 spikes in total across all trials of each condition within the analysis window.

For the V4 dataset, we estimated ϕ during the attention epoch of the trial in a window from 400 to 2300 ms aligned to the attention cue onset. 282 out of 285 single units passed the two criteria for all attention conditions combined and were used for quantifying the diversity of ϕ across neurons. 237 single units passed the two criteria in each of the four attention conditions and were used for comparing ϕ across attention conditions.

For the LIP dataset, we estimated ϕ in two trial epochs: fixation epoch, a window from  −200 to 0 ms aligned to the stimulus onset; and decision epoch, a window from 0 to the minimum of 1200 ms or reaction time aligned to the stimulus onset. Sixty-one out of 70 single units passed the two criteria during the decision epoch and were used to quantify the diversity of ϕ across neurons. 60 units passed the two criteria for both two-choice and four-choice decision-making tasks and were used for comparing ϕ across the tasks. Forty-five units passed the two criteria for both the fixation and decision epochs and were used for comparing ϕ across trial epochs.

For the PMd dataset, we estimated ϕ in two trial epochs: fixation epoch, a window from  −600 to 0 ms aligned to the stimulus onset; and decision epoch, a window from 0 to the minimum of 500 ms or reaction time aligned to the stimulus onset. 343 out of 801 single units passed the two criteria during the decision epoch and were used to quantify the diversity of ϕ across neurons. Two hundred seventy-two units passed the two criteria for each of 14 different task conditions (2 response sides times 7 stimulus difficulties) and were used for comparing ϕ across task conditions. Two hundred sixty-two units passed the two criteria for both the fixation and decision epochs and were used for comparing ϕ across two epochs.

Comparing spiking irregularity across task conditions

To test whether ϕ is a neuron-specific constant invariant to changes in behavioral and cognitive states, we estimated ϕ of each neuron separately in each task condition. We then used Demšar’s comparison test95 on these populations of paired measurements of ϕ using the autorank package96. The test is conducted for M populations (M is the number of compared conditions) with N paired samples (N is the number of neurons). The family-wise significance level of the tests is α = 0.05. First, we test the null hypothesis that the population is normal for each population. This null hypothesis was rejected for at least some populations in all our tests (detailed summary of the results in Supplementary Note 1.9). Since we have more than two populations and some of them are not normal, we use the non-parametric Friedman test as an omnibus test to determine if there are any significant differences between the median values of the populations. We use the post-hoc Nemenyi test to infer which differences are significant. We report the median, the median absolute deviation, and the mean rank among all populations over the samples (Supplementary Tables 1–5). Differences between populations are significant if the difference of the mean ranks is greater than the critical distance CD of the Nemenyi test.
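To illustrate the quantities involved in the post-hoc comparison, the sketch below computes mean ranks across paired samples and the Nemenyi critical distance in its standard form, CD = qα·sqrt(M(M + 1)/(6N)). It is a hand-written illustration and not the autorank implementation: it assumes no ties within a neuron's measurements, ranks the smallest value as 1, and takes the critical value qα as an input. The value 2.569 used for M = 4 at α = 0.05 is quoted from the commonly tabulated Nemenyi critical values and should be verified against the table in use.

```python
import math

def mean_ranks(samples):
    """Mean rank of each condition across paired samples.

    samples : list of N rows, each holding M values (one per condition).
    Assumes no ties within a row; rank 1 is the smallest value.
    """
    n_conditions = len(samples[0])
    totals = [0.0] * n_conditions
    for row in samples:
        order = sorted(range(n_conditions), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            totals[j] += rank
    return [t / len(samples) for t in totals]

def nemenyi_cd(q_alpha, n_conditions, n_samples):
    """Critical distance between mean ranks for the Nemenyi test."""
    return q_alpha * math.sqrt(n_conditions * (n_conditions + 1) / (6.0 * n_samples))

ranks = mean_ranks([[0.1, 0.5, 0.3], [0.2, 0.1, 0.9]])
# M = 4 conditions and N = 237 neurons, as in the V4 comparison.
cd_v4 = nemenyi_cd(2.569, 4, 237)
```

Under these assumptions the critical distance for the V4 comparison is roughly 0.30 in units of mean rank.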

For comparison between attention conditions in V4 data, we estimated ϕ separately on correct trials of each attention condition during the cue period (from 400 to 2300 ms aligned to cue onset), combined across different stimulus orientations. For comparison between two-choice and four-choice decision tasks in LIP data, we estimated ϕ separately for each task during the decision epoch (from 0 ms to the minimum of 1200 ms or reaction time aligned to stimulus onset), combined across different coherence levels and chosen sides. For comparison between fixation and decision epochs in LIP data, we estimated ϕ separately during the fixation (from  −200 to 0 ms aligned to stimulus onset) and decision epochs, combined across two-choice and four-choice decision tasks. For comparison between coherence levels in PMd data, we estimated ϕ separately for each coherence level during the decision epoch (from 0 ms to the minimum of 500 ms or reaction time aligned to stimulus onset), combined across the left and right chosen sides. For comparison between left and right choices in PMd data, we estimated ϕ separately on left and right choice trials during the decision epoch, combined across coherence levels. For comparison between fixation and decision epochs in PMd data, we estimated ϕ separately during the fixation (from  −600 to 0 ms aligned to stimulus onset) and decision epochs, combined across coherence levels and chosen sides.

The Friedman test failed to reject the null hypothesis that there is no difference in the central tendency of the populations for comparison between attention conditions in V4 data (p = 0.10, M = 4, N = 237, Supplementary Table 1), two-choice and four-choice decision tasks in LIP data (p = 0.109, M = 2, N = 60, Supplementary Table 2), fixation and decision epochs in LIP data (p = 0.84, M = 2, N = 45, Supplementary Table 3), coherence levels and left and right choices in PMd data (p = 0.11, M = 14, N = 272, Supplementary Table 4). Wilcoxon’s signed-rank test rejected the null hypothesis that there is no difference in the central tendency of the populations for comparison between fixation and decision epochs in PMd data (p = 3.35 × 10⁻¹², M = 2, N = 262, Supplementary Table 5).

Spiking neural network model

We simulated the three-layer spatial balanced spiking network model using the parameters and code from ref. 35. The network consists of three layers. Layer 1 contains Nf = 2500 excitatory neurons that generate spikes as independent Poisson processes at a uniform rate fin = 10 Hz. Layers 2 and 3 are recurrently connected networks, each comprising Ne = 40,000 excitatory (α = e) and Ni = 10,000 inhibitory neurons (α = i). All neurons (Nf, Ne, and Ni) are uniformly distributed on a unit square. Layer 1 provides feedforward input to Layer 2, where each neuron in Layer 1 connects to exactly \({P}_{{{\rm{fe}}}}^{(2)}\cdot {N}_{{{\rm{e}}}}\) excitatory and \({P}_{{{\rm{fi}}}}^{(2)}\cdot {N}_{{{\rm{i}}}}\) inhibitory neurons in Layer 2, with connection probabilities \({P}_{{{\rm{fe}}}}^{(2)}=0.1\) and \({P}_{{{\rm{fi}}}}^{(2)}=0.05\). Only excitatory neurons in Layer 2 project to Layer 3, where each neuron connects to exactly \({P}_{{{\rm{fe}}}}^{(3)}\cdot {N}_{{{\rm{e}}}}\) excitatory and \({P}_{{{\rm{fi}}}}^{(3)}\cdot {N}_{{{\rm{i}}}}\) inhibitory neurons, with \({P}_{{{\rm{fe}}}}^{(3)}=0.05\) and \({P}_{{{\rm{fi}}}}^{(3)}=0.05\). Both Layer 2 and Layer 3 are recurrently connected. Within each layer, every excitatory neuron connects to exactly Pee · Ne excitatory and Pei · Ni inhibitory neurons, while every inhibitory neuron connects to exactly Pie · Ne excitatory and Pii · Ni inhibitory neurons. The corresponding connection probabilities are Pee = 0.01, Pii = 0.04, Pei = 0.03, and Pie = 0.04.

For each population, neurons are uniformly distributed on a unit square where the position (xj, yj) of neuron j (1 ≤ j ≤ N) is \(0\, \leqslant \, {x}_{j}=\frac{1}{\sqrt{N}-1}\cdot \,{{\mbox{mod}}}\,(\,\,j-1,N)\), \({y}_{j}=\frac{1}{\sqrt{N}-1}\cdot \,{{\mbox{floor division}}}\,(\,\,j-1,N)\, \leqslant \, 1\), where N is the total number of neurons in that population. The probability of a synaptic connection depends on the distance between units. Unit i from group α ∈ {f, e, i} at location (xi, yi) connects to unit j from group β ∈ {f, e, i} at location (xj, yj) with the probability \({{\mathbb{P}}}_{\alpha }(i,j)=f({x}_{j}-{x}_{i},{\sigma }_{\alpha })f({y}_{j}-{y}_{i},{\sigma }_{\alpha })\), where

$$f(r,{\sigma }_{\alpha })=\frac{1}{{\sigma }_{\alpha }\sqrt{2\pi }}{\sum }_{k=-\infty }^{\infty }\exp \left[-\frac{{(r+2k)}^{2}}{2{\sigma }_{\alpha }^{2}}\right]\,,$$
(22)

with −1 ≤ r ≤ 1, \({\sigma }_{{{\rm{f}}}}^{(1)}=0.05\), \({\sigma }_{{{\rm{e}}}}^{(2)}={\sigma }_{{{\rm{i}}}}^{(2)}=0.1\), \({\sigma }_{{{\rm{f}}}}^{(2)}=0.1\), and \({\sigma }_{{{\rm{e}}}}^{(3)}={\sigma }_{{{\rm{i}}}}^{(3)}=0.2\). A presynaptic neuron can form multiple synaptic connections with a single postsynaptic neuron.
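The wrapped Gaussian of Eq. (22) can be evaluated numerically by truncating the infinite sum. The sketch below is illustrative: the truncation at |k| ≤ 5 and the function names are assumptions, and the truncation is adequate only when σ is small compared with the period of 2.

```python
import math

def wrapped_gaussian(r, sigma, k_max=5):
    """Eq. (22) with the infinite sum truncated to |k| <= k_max (assumption)."""
    total = sum(math.exp(-((r + 2 * k) ** 2) / (2 * sigma**2))
                for k in range(-k_max, k_max + 1))
    return total / (sigma * math.sqrt(2 * math.pi))

def connection_kernel(xi, yi, xj, yj, sigma):
    """Product of the one-dimensional kernels along x and y, as in the text."""
    return wrapped_gaussian(xj - xi, sigma) * wrapped_gaussian(yj - yi, sigma)

# Midpoint-rule check that the kernel integrates to 1 over -1 <= r <= 1.
n = 2000
area = sum(wrapped_gaussian(-1 + (i + 0.5) * 2 / n, 0.1) for i in range(n)) * 2 / n
```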

Each neuron is an exponential integrate-and-fire model, in which the membrane potential follows the dynamics:

$${C}_{m}\frac{d{V}_{j}^{\beta }}{dt}=-{g}_{\beta }\left({V}_{j}^{\beta }-{E}_{L}\right)+{g}_{\beta }{\Delta }_{T}{e}^{\frac{{V}_{j}^{\beta }-{V}_{T}}{{\Delta }_{T}}}+{I}_{j}^{\beta }(t).$$
(23)

gβ is the membrane conductance of neurons of type β. Cm is the membrane capacitance. EL is the resting potential. VT is the spike initiation threshold, the voltage level at which the neuron begins to exhibit rapid depolarization, leading to spike generation. ΔT is the sharpness parameter. It controls the steepness of the exponential rise in the membrane potential as the neuron approaches the threshold VT. When \({V}_{j}^{\beta }\) exceeds a threshold Vth, the neuron emits a spike, and the membrane potential is then set to a fixed value \({V}_{{{\rm{re}}}}\) for a duration of the refractory period τref. The total input current received by neuron j is
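A single-neuron version of these dynamics can be sketched with forward Euler integration of Eq. (23) divided by Cm, so that the membrane time constant τm = Cm/gβ and the drive I(t)/Cm are the only inputs. The numerical values in the example are placeholders chosen for illustration and are not taken from Table 1; the function name and the time step are likewise assumptions.

```python
import math

def simulate_eif(input_drive, dt, tau_m, E_L, V_T, Delta_T, V_th, V_re, tau_ref):
    """Forward-Euler integration of the exponential integrate-and-fire neuron.

    input_drive : sequence of I(t)/C_m values (mV/ms), one per time step
    tau_m       : C_m / g_beta (ms)
    Returns the list of spike times (ms).
    """
    V = E_L
    refractory_left = 0.0
    spikes = []
    for step, drive in enumerate(input_drive):
        if refractory_left > 0.0:
            V = V_re                      # hold at reset during the refractory period
            refractory_left -= dt
            continue
        leak_and_spike = -(V - E_L) + Delta_T * math.exp((V - V_T) / Delta_T)
        V += dt * (leak_and_spike / tau_m + drive)
        if V > V_th:                      # threshold crossing: spike and reset
            spikes.append((step + 1) * dt)
            V = V_re
            refractory_left = tau_ref
    return spikes

# Placeholder parameters (mV and ms), not the values of Table 1.
params = dict(dt=0.05, tau_m=15.0, E_L=-60.0, V_T=-50.0, Delta_T=2.0,
              V_th=-10.0, V_re=-65.0, tau_ref=1.5)
quiet = simulate_eif([0.0] * 4000, **params)    # no input: the neuron stays near E_L
driven = simulate_eif([2.0] * 4000, **params)   # constant drive: repetitive spiking
```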

$$\frac{{I}_{j}^{\beta }(t)}{{C}_{m}}={\sum}_{\alpha \in \{\,{\mbox{f,i,e}}\,\}}{\sum }_{k=1}^{{N}_{\alpha }}\frac{{J}_{kj}^{\alpha \beta }}{\sqrt{{N}_{{{\rm{e}}}}+{N}_{{{\rm{i}}}}}}{\sum}_{s}{\eta }_{\alpha }\left(t-{t}_{s}^{\alpha k}\right)+{\mu }_{\beta }\,.$$
(24)

Here \({J}_{kj}^{\alpha \beta }\) is the synaptic weight from neuron k to neuron j, where β indicates the type of postsynaptic neuron j and α indicates the type of presynaptic neurons. μβ is the static current injected into neurons of type β. The times \({t}_{s}^{\alpha k}\) indicate the time of sth spike of neuron k from the population α. The postsynaptic current triggered by a single spike is

$${\eta }_{\alpha }=\frac{1}{{\tau }_{{{\rm{d}}}}^{\alpha }-{\tau }_{{{\rm{r}}}}^{\alpha }}\left\{\begin{array}{ll}{e}^{-\frac{t}{{\tau }_{{{\rm{d}}}}^{\alpha }}}-{e}^{-\frac{t}{{\tau }_{{{\rm{r}}}}^{\alpha }}}\quad &t > 0,\\ 0\quad &t < 0,\end{array}\right.$$
(25)

where \({\tau }_{{{\rm{r}}}}^{\alpha }\) and \({\tau }_{{{\rm{d}}}}^{\alpha }\) are the synaptic rise and decay time constants, respectively, for population α. The feedforward synapses from Layer 2 to Layer 3 consist of both fast and slow components ηF(t) = 0.2 ηe(t) + 0.8 ηs(t), where ηs(t) has the same form as Eq. (25) with a rise time constant \({\tau }_{r}^{s}=2\) ms and a decay time constant \({\tau }_{d}^{s}=100\) ms. All other parameters of neurons are provided in Table 1 and the connectivity parameters are provided in Table 2.
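The kernel of Eq. (25) and the mixed feedforward kernel can be written directly, as in the sketch below. The slow time constants are those stated in the text; the fast time constants passed in the example are placeholders, since the actual values are given in Table 2. The numerical integral illustrates that each kernel is normalized to unit area.

```python
import math

def eta(t, tau_r, tau_d):
    """Eq. (25): difference-of-exponentials kernel, zero for t < 0 (times in ms)."""
    if t < 0.0:
        return 0.0
    return (math.exp(-t / tau_d) - math.exp(-t / tau_r)) / (tau_d - tau_r)

def eta_feedforward(t, tau_r_e, tau_d_e, tau_r_s=2.0, tau_d_s=100.0):
    """Mixed kernel 0.2 * eta_e + 0.8 * eta_s for Layer 2 to Layer 3 synapses."""
    return 0.2 * eta(t, tau_r_e, tau_d_e) + 0.8 * eta(t, tau_r_s, tau_d_s)

# Midpoint-rule integral over 0-3000 ms; tau_r_e = 1 ms and tau_d_e = 5 ms are placeholders.
dt = 0.01
area = sum(eta_feedforward((i + 0.5) * dt, 1.0, 5.0) for i in range(300_000)) * dt
```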

Table 1 Single neuron parameters in the spiking network model
Table 2 Connectivity parameters in the spiking network model

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.