Abstract
Classification of single neurons at a brain-wide scale is a way to characterize the structural and functional organization of brains. Here we acquired and standardized a large morphology database of 20,158 mouse neurons and generated a potential connectivity map of single neurons based on their dendritic and axonal arbors. With such an anatomy–morphology–connectivity mapping, we defined neuron connectivity subtypes for neurons in 31 brain regions. We found that cell types defined by connectivity show distinct separation from each other. Within this context, we were able to characterize the diversity in secondary motor cortical neurons, and subtype connectivity patterns in thalamocortical pathways. Our findings underscore the importance of connectivity in characterizing the modularity of brain anatomy at the single-cell level. These results highlight that connectivity subtypes supplement conventionally recognized transcriptomic cell types, electrophysiological cell types and morphological cell types as factors to classify cell classes and their identities.
Main
A mammalian brain is a complex network of tens of millions of neurons and supporting cells that work together to carry out its functions1,2. Cataloging brain-wide neuron types is a way to understand the structural and functional organization of the brain3,4. Advances in classifying neurons at the whole-brain scale usually rely on four major types of attribute: anatomical5,6, physiological7, morphological3,4 and molecular8,9. The classification of individual neurons based on connectivity at the whole-brain scale or in partial brain regions has primarily been constrained to electron microscopy (EM)-based imaging techniques.
Large-scale data acquisition and analyses at single-neuron resolution have identified neurons on the basis of their three-dimensional (3D) registered soma locations to the standard brain atlases. However, such soma-location cell types (s-types, defined by the anatomical region where a soma is located), provide only an anatomical reference of the respective neurons, with marginal indication about the structural, physiological, molecular and other attributes of neurons. Techniques10 to record electrophysiological, morphological and transcriptional properties of individual neurons have generated massive resources of these data modalities11 that facilitate the classification of various s-types.
The description of the morphology of neurons has been a crucial force to advance neuroscience since the time of Cajal12. While high-resolution digital reconstruction of the 3D morphology of neurons is challenging13,14, efforts in large-scale semi-automatic reconstruction have yielded substantial datasets for whole mouse brains3,4,15 and other complex primate brains16. A comparative approach taking advantage of these resources will facilitate understanding of the morphological classification and distribution of single neurons. We also envision that an objective comparison of the morphology of neurons, especially from various data sources, be carried out in a standardized coordinate system of an entire brain. Advances in brain mapping and registration17 provide such an opportunity.
Recent neuroscience research has highlighted the urgent need and various approaches to study neuronal connectivity and the whole-brain connectome18,19. The MICrONS Consortium has produced a number of analyses on the connectivity of cortical neurons using EM datasets20,21. Parallel efforts also include the EM-based reconstruction and analysis of the Drosophila hemibrain that have made limited use of the cell connectivity to define cell types22. However, the EM approach has not yet scaled up to a whole mouse brain, which motivated us to take an alternative approach. Indeed, the connectivity of neurons has to be mediated by their morphology, making it challenging to study at the whole-brain scale. Although it is clear that neurons can be classified on the basis of their regional projection and connectivity, a quantitative study of the connectivity types of neurons in mammalian brains has been challenging.
The goal of this study is to make an initial attempt to define neuronal connectivity in the context of the whole brain on the basis of high-throughput single-neuron morphology reconstruction datasets. In particular, we analyzed data with axonal and dendritic reconstructions of all brain regions to catalog potential connectivity subtypes of anatomically defined neurons. We did so by inferring potential axon–dendrite connectivity of single neurons for standard whole-brain anatomical regions. We further investigated the role of such single-neuron connectivity and found that connectivity subtypes supplement the conventionally recognized transcriptomic cell types (t-types), electrophysiological cell types (e-types) and morphological cell types (m-types) as a way to differentiate cell classes and identities. We showed that a combined dataset of axons and dendrites registered to a standard brain atlas at the whole-brain scale led to quantitative modeling of the potential connectivity of individual neurons. Our findings underscore the importance of potential connectivity in characterizing the modularity of brain anatomy as well as the cell subtypes.
Results
Whole-brain map of neuron arbors and projections
To formalize terminology, we call each group of neurons whose somas are in the same brain region a soma type, or s-type. A neuron type determined on the basis of clustering of morphological features, or m-features, is called a morphology type, or m-type. Many morphological features, such as length, surface area and number of branches of a neuron, are independent of a neuron’s spatial orientation, while other m-features, such as width and height, may be associated with a neuron’s orientation. In the context of this work, m-types are defined as groups of cells that can be clustered on the basis of m-features. A neuron type determined on the basis of clustering of connectivity features or profiles, or c-features, of neurons is called a connectivity type, or c-type. All c-features are orientation independent, and thus c-type is not associated with a neuron’s orientation. Of note, connectivity is often associated with a topological direction, that is, where axons project to and where neuronal input signals come from. Morphology features do not immediately exhibit such a directionality except partitioning into dendritic and axonal arbors. Any quantifiable neuronal connectivity must be based on an anatomically precise mapping of individual neurons’ morphology at the whole-brain scale. Thus, c-types are associated with m-types but also involve multiple neurons, neuron populations and brain regions, in a standardized manner. In the following analyses, the term c-types may represent potential or predicted connectivity of neurons to certain brain regions. However, many of these have been cross-validated through our literature review (Supplementary Table 1). For the sake of clarity, we will prepend adjectives such as ‘potential’ to the names when necessary to avoid confusion.
To effectively study s-types, m-types and c-types, we built a comparative neuron morphology database that consists of 20,158 neuron-morphology reconstructions (Fig. 1). We first aggregated four state-of-the-art neuron reconstruction datasets from independent sources (Extended Data Table 1). In detail, we generated 3D dendritic morphology of 10,860 neurons, called DEN-SEU (Fig. 1a,b, Extended Data Fig. 1 and Supplementary Fig. 1), to complement the full morphologies (complete axons and dendrites) of 1,741 neurons in the BRAIN Initiative Cell Census Network (BICCN) Allen Institute for Brain Science/Southeast University – Allen Institute Joint Center (AIBS/SEU-ALLEN) neuron morphology dataset3 and 1,200 neurons in the Janelia MouseLight project dataset4 (Supplementary Fig. 1), and the axonal morphology of 6,357 neurons generated by the Institute of Neuroscience – Chinese Academy of Sciences (ION)15. For fair and comprehensive analyses, we cross-validated neuron morphologies (Supplementary Figs. 2 and 3) to avoid potential systematic bias favoring one particular way of generating the respective neuron data. Moreover, we applied 3D brain registration to all these neurons to map them onto the same spatial coordinate system, the Allen mouse brain Common Coordinate Framework, CCFv3 (ref. 23), so that all these neurons’ soma locations and 3D morphologies can be compared against each other directly.
a, Standardized 3D locations of 20,158 neurons in this study, pooled in two cohorts, that is, one with full axon reconstructions (n = 9,298) and one with local dendritic reconstructions (n = 10,860) that covers all brain regions. The bar chart shows the soma density (r_soma, per mm3) in main brain regions (see the Methods for abbreviations). b, Thirty-eight full neuron reconstruction examples with different arborization patterns innervated from 8 brain regions (CTXsp, HPF, isocortex, OLF, PAL, STR, TH and P; see the Methods for abbreviations) and 20 dendritic reconstructions in different brain regions. The bar chart shows the arbor density (r_arbor, per μm3) for major brain regions. c, Connection examples indicated by the projection patterns of MOp cells to SSp, SSp-ll, SSp-ul, SSs, CP, MOs, PO, PG, MY, MY-sen and MY-mot regions. The bar chart shows a histogram of outgoing and incoming connections of brain-wide projections; r_proj is the ratio of the number of neurons passing through a specific number of brain regions normalized against the total number of neurons. d, A whole-brain arborization map of 2,941 neurons. A similar map for all 9,298 neurons with axons was also produced (Supplementary Fig. 7). The horizontal axis indicates the soma location of single cells (n > 10). The vertical axis indicates the arbor projection regions, with an average projection length >2,000 μm, which are also grouped into larger brain areas. The size of circles represents the arbor length in the brain regions. The color bar shows the ratio of local and distal arbors relative to soma locations.
The somas of the DEN-SEU neurons are distributed fairly evenly across major brain regions, while the other three full morphology or axon datasets focus on specific brain regions of the cerebral cortex, thalamus, striatum, hypothalamus, hippocampus and claustrum (Fig. 1a). Compared with the sparse and long projection of axons in the other datasets, the dendritic arbors of DEN-SEU cover all CCFv3 brain regions (Fig. 1b), making this dataset suitable for analyzing the target projection/connection regions of axons of neurons. We also cross-validated the quality and distribution of our assembled neuron data with other publicly documented neuron morphologies shared by independent laboratories (Supplementary Figs. 4 and 5 and Supplementary Table 2) via NeuroMorpho.Org24,25. At the same time, the projection pathways of aggregated full neuron and axon data in this study capture many regional connections, such as neurons originating from the primary motor area (MOp) (Fig. 1c) and other major brain regions from the cortex, hippocampus, striatum and thalamus (Supplementary Fig. 6). The total number of brain regions reached by projection axons follows a broad distribution (Fig. 1c), indicating that most axons normally project to relatively distal regions. By contrast, dendrites extend a much shorter distance, invading at most five or six brain regions near their soma anatomical locations (Fig. 1c and Supplementary Table 3). The large number of neurons involved in this study form complex patterns of potential connectivity, which should be quantified and analyzed in a principled way. We tackled this challenge by considering the modularity and granularity of individual neurons.
Neuron arbors often correlate with regions of dense connections between neurons. Therefore, we used a machine learning method, AutoArbor3, to determine the topologically connected arborization regions of neurons automatically. We started with 2,941 fully reconstructed neuron morphologies in the AIBS/SEU-ALLEN and MouseLight datasets to produce a brain-wide arborization map of a mouse brain (Fig. 1d). In this way, various neuronal pathways indicated in our datasets (Fig. 1c) are quantitatively modeled. For example, we observed clear modules of projection and potential connection patterns in large brain regions such as the isocortex, striatum and thalamus. This motivated us to characterize the potential connectivity among neurons systematically using their structural components, that is, neurite arbors. Accordingly, in total, we generated 26,205 axonal arbors and 20,158 dendritic arbors (Fig. 1b) for all neurons in this study. We subsequently used these arbors to define the connectivity among neurons and respective c-types.
Connectivity profiles augment morphology-based neuron types
Distinct from morphology analyses of neurons that rely on various m-features, such as the Sholl analysis26, L-Measure27 and extended global or local structural features28, we study cell typing by generating the neuron connectivity features that capture the relationships among individual neurons. One approach to quantifying single-neuron connectivity is based on axon–dendrite colocalization that needs precise details on synaptic contact location approximation29, which, however, is still challenging at the whole-brain scale. Our approach is to use soma locations and defined spatial domains of neuronal arborization. Specifically, we determined the connection targets of a neuron based on the 3D registered brain regions invaded by its axonal arbor, and the connection strength based on the spatial adjacency of this neuron’s axonal arbor and nearby dendritic arbors of neurons in our dataset (Fig. 2a). We detected arbor domains of neurons that originated from a specific brain region using Gaussian mixture models (GMMs) (Fig. 2a) and produced spatially and statistically optimal parcellation of projection sites of all s-types. Within each arbor domain, the arborization pattern of each group of neurons of the same s-type was approximated using a spatially homogeneous Gaussian distribution. For example, somatosensory cortex neurons were found to have nine arbor domains, five of which contained the respective somas, so they were called the dendritic domains (Fig. 2b). The remaining arbor domains that were far away from somas and also contained axons were called axonal domains.
a, A schematic overview of the definition of arbor domains and potential connectivity. Left: neurons are categorized into soma-location types (s-types) based on their cell body location in the CCFv3 anatomical region. Within each s-type, morphological coordinates are clustered using a GMM, forming arbor domains. A dendritic arbor domain contains a major number of somas. Right: overlapping voxels between axonal and dendritic domains define the potential connectivity. b, A schematic illustration of dendritic arbor domains for SSp neurons in middle sections of the CCFv3 atlas outline (left, coronal half-view; right, sagittal half-view). c, A heat map of potential connectivity for VPM neurons, which project to SSp heavily. The horizontal axis indicates the dendritic domains (as indicated by the prefix ‘d’) with renumbered identifiers denoting the domain center coordinates in b (see Supplementary Table 4 for a complete list of domains); only the top-25 domains with the greatest variances are shown for clarity, while the entire feature vector was used in clustering. The vertical axis indicates the clustered VPM neurons. The dendrogram on the left shows hierarchical clustering of the potential connectivity feature vectors of neurons. The orange lines indicate cluster boundaries in the heatmap. The color bar shows the number of overlapping voxels between a neuron of interest and dendritic domains. d, Horizontal view of VPM neurons overlaid on the CCFv3 contour colored by the clusters obtained from potential connectivity. e, A comparison of clustering results based on morphology features only (top) and based on joint feature vectors by concatenating morphology and connectivity features (bottom). Top left: a scatter plot of MOp, SUB and VPL s-types. The horizontal axis indicates the total length of the neurons in μm. The vertical axis indicates the maximum branch order. 
Bottom left: 3D scatter plot of the total length, maximum branch order and first component of a PCA of the potential connectivity matrix. The c-types obtained are colored with different shades of each s-type color. Right: heat maps showing the overlap between paired s-type point clouds in scatter plots, with color indicating the misclassification percentage using SVM classification.
For each neuron in a specific s-type, we then computed a connection barcode (Fig. 2c) as the features to characterize the axon–dendritic spatial overlap of its axonal arbors and dendritic arbor domains of all s-types at the whole-brain scale, all defined in the standard CCFv3 space. For DEN-SEU, we produced 19 dendritic arbor domains per brain hemisphere. We also produced another 56 dendritic arbor domains per brain hemisphere for other s-types with at least 60 reconstructed neurons. These dendritic arbor domains spanned an average volume of 8.94 mm3. The resultant connection barcode was thus a 150-dimensional feature vector for the entire brain, indicating how axons of neurons in an s-type would project and potentially connect to various dendritic domains in the context of whole-brain anatomy. With this barcode, neurons belonging to an s-type were further clustered. For instance, ventral posteromedial nucleus of the thalamus (VPM) neurons were clustered into four connection groups (Fig. 2c), which are visually separable from each other (Fig. 2d). In other examples, domains obtained in some s-types may distribute among different cortical layers (Supplementary Fig. 8), although the connectivity scores depended on the actual axon–dendritic spatial overlap. Of note, our definition of dendritic domains was based on pooling single-neuron tracings in a large dataset, which we used to identify separable subregions with a high density of dendrites. Our arbor domains differ from brain regions based on conventional, cytoarchitectonic features defined in widely used atlases (for example, Franklin-Paxinos Atlas30 and Allen Reference Atlas31).
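As an illustration of the barcode idea only, the following minimal Python sketch counts, for one neuron, how many of its axonal voxels fall inside each dendritic domain. The function name `connection_barcode`, the voxel-to-domain lookup table and the toy coordinates are all hypothetical; the sketch also assumes that each voxel belongs to at most one domain, which need not hold for GMM-derived domains. With 19 + 56 domains per hemisphere and two hemispheres, a vector built this way would have 150 entries.

```python
from collections import Counter

def connection_barcode(axon_voxels, domain_of_voxel, domain_ids):
    """Count, per dendritic domain, the voxels occupied by one neuron's axon.

    axon_voxels: iterable of (x, y, z) integer voxel coordinates (hypothetical input).
    domain_of_voxel: dict mapping a voxel to a dendritic-domain id (hypothetical lookup).
    domain_ids: ordered list of all domain ids; it fixes the barcode layout.
    """
    counts = Counter()
    for voxel in set(axon_voxels):  # count each occupied voxel once
        domain = domain_of_voxel.get(voxel)
        if domain is not None:
            counts[domain] += 1
    return [counts[d] for d in domain_ids]

# Toy example with three dendritic domains.
domain_ids = ["d1", "d2", "d3"]
domain_of_voxel = {
    (0, 0, 0): "d1", (0, 0, 1): "d1", (0, 1, 0): "d1",
    (5, 5, 5): "d2", (5, 5, 6): "d2",
    (9, 9, 9): "d3",
}
# The axon visits one voxel twice and one voxel outside every domain.
axon = [(0, 0, 0), (0, 0, 1), (0, 0, 1), (5, 5, 5), (7, 7, 7)]
barcode = connection_barcode(axon, domain_of_voxel, domain_ids)
print(barcode)  # [2, 1, 0]
```

Barcodes computed in this way for all neurons of an s-type can then be stacked into a matrix and passed to any standard clustering routine.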
To understand the advantage of the connectivity barcode, we first applied it to assist conventional morphological analysis of cell types, which clusters s-types or their subtypes based on m-features. It was difficult to separate MOp, subiculum and ventral posterolateral nucleus of the thalamus (VPL) neurons that have heavily overlapping m-features, as seen in both the overlap scores and the feature scatter plot (Fig. 2e). However, when the connectivity features were appended to the m-feature vectors to cluster these three s-types, they became clearly separable, with minimal overlap (Fig. 2e). When the first principal component of the connectivity features was added in visualization, the separation of the three s-types was visible (Fig. 2e). This shows that the connectivity features help to discriminate neuron classes, similar to the dimension-increment analysis or support vector machines (SVMs)32,33 in pattern recognition and machine learning, where nonseparable classes could become distinguishable in higher-dimensional spaces.
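The effect of appending a connectivity feature can be illustrated with a toy Python sketch. It does not reproduce the SVM-based overlap score used in Fig. 2e; a simple nearest-centroid rule is swapped in to keep the example dependency-free, and all feature values are invented. Real m-features (for example, total length in μm and branch order) would need to be standardized before distances are compared.

```python
import math

def centroid(rows):
    """Mean feature vector of a list of equally long tuples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def misclassification_rate(class_a, class_b):
    """Fraction of points lying closer to the other class's centroid."""
    ca, cb = centroid(class_a), centroid(class_b)
    errors = sum(math.dist(p, ca) > math.dist(p, cb) for p in class_a)
    errors += sum(math.dist(p, cb) > math.dist(p, ca) for p in class_b)
    return errors / (len(class_a) + len(class_b))

# Two hypothetical s-types whose morphology features interleave.
m_a = [(1.0, 1.0), (3.0, 1.0), (5.0, 1.0), (7.0, 1.0)]
m_b = [(2.0, 1.0), (4.0, 1.0), (6.0, 1.0), (8.0, 1.0)]

# One hypothetical connectivity component that differs between the two s-types.
c_a = [0.0, 0.0, 0.0, 0.0]
c_b = [10.0, 10.0, 10.0, 10.0]

# Joint vectors: morphology features with the connectivity feature appended.
joint_a = [m + (c,) for m, c in zip(m_a, c_a)]
joint_b = [m + (c,) for m, c in zip(m_b, c_b)]

rate_m = misclassification_rate(m_a, m_b)
rate_joint = misclassification_rate(joint_a, joint_b)
print(rate_m, rate_joint)  # 0.5 0.0
```

In this constructed case, half of the neurons are misassigned with morphology alone, whereas none are misassigned once the extra dimension is appended.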
To further assist the above analyses, we examined the concrete examples of VPM neurons (Supplementary Fig. 9) to confirm their separation (Fig. 2c,d). We found that with connectivity features we were able to produce clear clustering that was not manifested in the respective soma locations (Supplementary Fig. 10). In addition, our principal component analysis (PCA) of the full morphological feature set suggests that these MOp, subiculum and VPL neurons could not be easily separated on the basis of morphology features alone (Supplementary Fig. 11). Our conjugated analysis, by incorporating connectivity features, led to a clearer separation of these three s-types (Fig. 2e). We systematically compared the difference in overlap when using morphological or connectivity features among all pairs of s-types or connectivity subtypes. Our analysis showed that classification overlap was lower when using connectivity features compared with morphological features (Supplementary Fig. 12, two-sided Wilcoxon test P = 2.5 × 10−7 for s-types and P < 2.2 × 10−16 for c-types).
c-types outperform m-types in neuron classification
To investigate whether c-features would classify cell types better than conventionally used m-features3,5, instead of providing auxiliary dimensions to assist cell typing, we computed the similarity scores of morphological features (m-score) of all 31 known s-types (n > 60) (Fig. 3a). Except for a minority (25.8%) of s-types that have relatively low similarity in their m-features, the majority of s-types (74.2%) make up three boxed cohorts, within each of which neurons of different s-types share high-similarity m-features (Fig. 3a). As a quality control, we checked the similarity scores between the c-features (c-score) of all three cohorts of s-types. The scores for the three cohorts were reduced, while the c-scores of the other eight s-types remained low (Fig. 3b). This pattern was also observed in comparing m-scores and respective c-scores calculated on the basis of dendritic or full-neuron morphological features (Extended Data Fig. 2). In other words, in general, an s-type is well separated from other s-types in the space of c-features.
a, Clustering based on similarity score of morphology features, that is, m-score, of 31 s-types (n > 60 in each) in cortex, thalamus and striatum. To ensure uniformity in the analysis of all neurons, including those without dendritic reconstructions, we focused exclusively on axons for computing morphological features. The yellow boxes indicate three cohorts of s-types that share similar morphology features. The values in the matrix indicate the normalized similarity between 0 and 1. b, The similarity score of connectivity features, that is, c-scores, sorted using the same order of s-types as in a. The color bar shows the normalized similarity among features (also the same as in a). c, The ratio matrix of the c-score in b over the m-score in a. d, Joint and marginal distributions of corresponding c-scores and m-scores for all pairs of s-types and boxed pairs in a and b. e, A histogram of c/m-score ratios in c. f, A paired comparison of UMAP clustering of s-types using either morphology or connectivity features, corresponding to the 12 smallest c/m-score ratios in c.
We further directly compared corresponding m-scores and c-scores to quantify the improvement in the cell-typing performance of connectivity features over morphological features. Here, 76% of entries in the ratio matrix of c-scores and m-scores (Fig. 3c–e) are less than 1, while 99% of such entries corresponding to the boxed cohorts are less than 1. We also visualized the actual clustering of neurons based on either morphological features or connectivity features. Examination of the paired Uniform Manifold Approximation and Projection (UMAP) clustering for the 12 smallest ratios of c-scores and m-scores shows that c-features are more separable than m-features (Fig. 3f). For example, anterior cingulate area ventral part–layer 5 (ACAv5) neurons have mixed m-features with agranular insular area dorsal part–layer 2/3 (AId2/3) neurons, but their c-features are clearly separable (Fig. 3f and Supplementary Fig. 13). This is also the case for MOs layer 5 neurons versus infralimbic area–layer 5 (ILA5) neurons, orbital area medial part–layer 2/3 (ORBm2/3) neurons versus ACAv5 neurons and all other visualized pairs of cell types, although all these cases have varying distributions in their UMAP space (Fig. 3f). As our results analyze the largest neuron archives for the mouse brain containing major neuron classes, it is reasonable to conclude that c-features could serve as strong contenders to m-features for cell typing of neurons whose somas are from well-established anatomical regions.
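The ratio summary can be sketched in a few lines of Python. The helper name `ratio_summary` and the 3 × 3 toy matrices are hypothetical, and the sketch assumes that only off-diagonal pairs of distinct s-types are counted; the text does not state whether diagonal entries were included.

```python
def ratio_summary(c_scores, m_scores):
    """Return c/m ratios for all distinct s-type pairs and the fraction below 1.

    Both inputs are symmetric similarity matrices given as lists of lists.
    Only the upper triangle (i < j) is used; this is an assumption of the sketch.
    """
    n = len(c_scores)
    ratios = [
        c_scores[i][j] / m_scores[i][j]
        for i in range(n)
        for j in range(i + 1, n)
    ]
    below = sum(r < 1 for r in ratios) / len(ratios)
    return ratios, below

# Toy similarity matrices for three hypothetical s-types.
m_scores = [
    [1.0, 0.8, 0.4],
    [0.8, 1.0, 0.6],
    [0.4, 0.6, 1.0],
]
c_scores = [
    [1.0, 0.2, 0.5],
    [0.2, 1.0, 0.3],
    [0.5, 0.3, 1.0],
]

ratios, fraction_below_one = ratio_summary(c_scores, m_scores)
print(ratios, fraction_below_one)
```

Here two of the three pairs have a ratio below 1, meaning that those pairs look less similar, and hence more separable, in connectivity space than in morphology space.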
Connectivity correlates with spatial separation of cell subtypes
We investigated whether c-features would also help to identify subtypes of neurons that share their soma locations in the same anatomical area. We generated a distance map (d-map) to measure the spatial separation of two neurons based on their soma locations (Fig. 4a). Because within any specific brain region neurons were labeled in a stochastic way, the pairwise soma distance may form a Gaussian-like or Gaussian-mixture distribution (Fig. 4b). In particular, when somas scatter almost uniformly within a brain region, their pairwise distance will be close to Gaussian, such as the dorsal part of the lateral geniculate complex (LGd) and caudate putamen (CP) neurons (Fig. 4b). Conversely, when somas form two or more subclusters within a region, their pairwise distances may form a distribution with a long tail, or approximately a Gaussian mixture distribution, such as the anterior cingulate area dorsal part–layer 6a (ACAd6a) neurons in this database (Fig. 4b). Correlating the morphology similarity scores (m-scores) and potential connectivity similarity scores (c-scores) with a d-map provides a way to understand which kinds of features may help to identify subtypes of neurons whose somas are from subareas in an established s-type.
a, A pairwise soma-distance map of CP neurons biclustered on the basis of the spatial adjacency of somas mapped to CCFv3. b, Histograms of the pairwise soma distances for neurons in CP, ACAd6a and LGd regions. c, Matrices of morphology-feature similarity scores (m-similarity) and connectivity-feature similarity scores (c-similarity) of individual CP neurons, rows and columns sorted in the same order as in the clustered d-map in a. Cosine similarity scores are used. d, Connectivity-feature-based clustering of CP neurons into three main subclasses (red, blue and green), in the same convention as in Fig. 2c. e, A 3D visualization of the three CP neuron subclasses in d. f, The 3D soma locations of CP neurons. Color indicates the largest Euclidean distance between axonal terminals and the respective soma. g, LGd neuron d-map and respective m- and c-similarity matrices, rows and columns sorted in the same order. h, ACAd6a neuron d-map and respective m- and c-similarity matrices, rows and columns sorted in the same order. i, Histogram of the difference between corresponding c- and m-similarities for all neurons of the 31 s-types in this study. j, Scatter plot and marginal distributions of corresponding c- and m-similarities for all neurons in the 31 s-types (linear least-squares regression, R2 = 0.245; Pearson correlation coefficient 0.495, two-sided ***P < 0.0001, with the precise value being P < 2.2 × 10−16). k, Correlations between soma-distance map and c- or m-similarity for all 31 s-types. Corr, Pearson correlation coefficient. l, Overall correlations between neurons’ soma distances and the respective similarities in connectivity features or morphology features. l_m_corr, correlation between location distances and m-similarities; l_c_corr, correlation between location distances and c-similarities. The center line indicates the median (50th percentile). The box bounds represent the interquartile range (IQR) between the 25th (Q1) and 75th (Q3) percentiles.
The whiskers extend to values within 1.5× IQR. Outliers beyond whiskers are shown as individual markers. Minimum and maximum values are within whisker bounds (two-sided paired-samples t-test, sample size: n = 31, **P = 5.30 × 10−4).
In the example of CP neurons, we calculated the pairwise m-score and c-score matrices (Fig. 4c) sorted in the same order of neurons as in the respective d-map (Fig. 4a). Using the c-features, we obtained three major CP clusters (Fig. 4d) with different projection and arborization patterns (Fig. 4e), although their somas colorized with morphological features were mixed fairly uniformly (Fig. 4f), while there was no obvious subcluster based on the similarity matrix of the m-features (Fig. 4c). Similarly, we computed the d-maps and respective m-scores and c-scores matrices for LGd and ACAd6a neurons (Fig. 4g,h). The Gaussian-mixture-like distribution of the pairwise neuron distances of ACAd6a neurons also translated to potential clusters in the ACAd6a d-map (Fig. 4h), while the single Gaussian-like distributions of CP and LGd neurons (Fig. 4b) corresponded to the less clear hierarchical clustering of the respective sorted d-maps (Fig. 4a,g).
We computed the corresponding d-maps and m-score and c-score matrices for all 31 s-types of neurons. We found that, for any pair of neurons, the c-scores were only slightly greater than the m-scores (Fig. 4i). There was a positive correlation between these two scores (Fig. 4j), of which m-scores followed a flatter marginal distribution than c-scores (Fig. 4j); this indicated that, statistically, it would be harder to produce clearly segregated neuron clusters based on morphology similarity. However, the corresponding entries of the d-map and c-score matrices were clearly negatively correlated, and this correlation was stronger than that between d-map and m-score entries (Fig. 4k,l and Extended Data Table 2). Only 6 out of 31, or 19.4%, s-types showed a stronger negative correlation between soma location and morphology similarity than between soma location and connectivity similarity (Fig. 4k). Neurons with far-away soma locations could be, at most, four times more likely to have different c-features than m-features (Fig. 4l). Thus, we concluded that potential subtypes for an s-type were statistically better represented by c-features than by m-features.
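The comparison between d-map entries and similarity scores can be sketched in Python as follows. Cosine similarity and the Pearson correlation coefficient follow the measures named in the Fig. 4 caption; the function names, soma coordinates and feature vectors are invented for illustration, and the sketch uses every unordered pair of neurons once.

```python
import math
from itertools import combinations

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def distance_similarity_correlation(somas, features):
    """Correlate pairwise soma distances with pairwise feature similarities."""
    pairs = list(combinations(range(len(somas)), 2))
    dists = [math.dist(somas[i], somas[j]) for i, j in pairs]
    sims = [cosine_similarity(features[i], features[j]) for i, j in pairs]
    return pearson(dists, sims)

# Four hypothetical neurons forming two spatial groups within one region.
somas = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (11, 0, 0)]
c_features = [(5, 0), (4, 1), (0, 5), (1, 4)]  # differ between the two groups
m_features = [(3, 3), (3, 2), (2, 3), (3, 3)]  # nearly identical everywhere

l_c_corr = distance_similarity_correlation(somas, c_features)
l_m_corr = distance_similarity_correlation(somas, m_features)
print(l_c_corr, l_m_corr)
```

In this toy case the distance-to-connectivity correlation is strongly negative while the distance-to-morphology correlation is close to zero, which is the kind of contrast summarized in Fig. 4k,l.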
Spatially tuned connectivity features identify cell subtypes
Anatomical subgrouping of neurons within a specific brain region reflects the spatial coherence of these cells. As c-features correlated more strongly with the spatial adjacency of neurons, for each s-type we combined connectivity profiles and spatial adjacency to cluster neurons and identify potential anatomical subtypes. We called this approach spatially tuned c-features, with which we produced clear subtyping of neurons (Fig. 5 and Extended Data Fig. 3) that we had not been able to identify using alternative methods.
Top rows: biclustered spatially tuned connectivity similarity matrices, where different colors along the x and y axes indicate the clusters, and the index numbers of neurons in a specific s-type are shown on both x and y axes. Middle rows: triview visualization of neurons in CCFv3; neurons are rendered in the same colors as in the respective top-row clusters. A, anterior; P, posterior; D, dorsal; V, ventral; L, left; R, right. Bottom rows: dataset composition in each cluster. The percentage and neuron count of each dataset are labeled. D1, D2 and D3 refer to AIBS/SEU-ALLEN, MouseLight and ION, respectively. a–c, Subtyping of PL neurons for layer 2/3 (PL2/3) (n = 188) (a), layer 5 (PL5) (n = 795) (b) and layer 6a (PL6a) (n = 99) (c). d–f, Subtyping of MOs neurons in MOs layer 2/3 (MOs2/3) (n = 218) (d), layer 5 (MOs5) (n = 359) (e) and layer 6a (MOs6a) (n = 116) (f). g–i, Subtyping of thalamic neurons in VPL (n = 91) (g), VPM (n = 406) (h) and LGd (n = 78) (i).
For cortical neurons (Fig. 5a–f), we found that neurons in the prelimbic area (PL) had two subtypes in each of layer 2/3 (Fig. 5a), layer 5 (Fig. 5b) and layer 6a (Fig. 5c). Neurons in layers of the secondary motor area (MOs) could also be clustered into subgroups (Fig. 5d–f). The layer 2/3 MOs neurons were clustered into two large subgroups indicated by the sorted distance matrix, along with distinct projection patterns of these subgroups in the cross-sectional views of the CCFv3 space (Fig. 5d). Similarly, layer 5 and layer 6a MOs neurons were each divided into two subgroups (Fig. 5e,f). Detailed examination of these MOs subtypes provided guidance for analyzing connectivity-based subtypes of cortical neurons.
We also attempted to identify subregions in the thalamic gateway related to sensory and motor input, particularly VPL (Fig. 5g), VPM (Fig. 5h) and LGd (Fig. 5i). We found that subregions of somas in these areas corresponded to neurons projecting to distinguishable spatial targets, often visualized as homogeneous color blobs of neuron subclusters, which were particularly clear in the three subtypes of VPM neurons (Fig. 5h). LGd has three known anatomical subregions34,35, that is, LGd-shell, LGd-core and LGd-ip (ipsilateral zone). We found two major distinguishable subtypes of somas using our approach, which might provide further spatial granularity to study the previously documented subregions. Of note, LGd neurons could not be clearly clustered using either morphology features or connectivity or spatial distance features alone (Fig. 4g). Our clustering of all other s-types indicated similar results (Extended Data Fig. 3), highlighting the power of spatially tuned c-features.
Subtyping neurons reveals diversified cellular characteristics
MOs neurons have long axonal projections that subserve animal decisions36. In addition to individual neurons’ spatial patterning, we profiled the symmetry of MOs connectivity using the cortical layer 5 neurons. To do so, we kept the somas separated when calculating their space d-map in the respective hemispheres. Our examination of individual MOs neurons confirmed long-range projection targets at the full-brain scale (Fig. 6a). The overall projection patterns of these MOs neurons were consistent with the previously documented population projection37 (Fig. 6a). We found that the somas in MOs5–subtype 1 (MOs5_1) and those of MOs5–subtype 2 (MOs5_2) and MOs5–subtype 3 (MOs5_3) clusters distributed on the two sides of the brain (Fig. 6a), while the somas in MOs5_2 and MOs5_3 essentially intermingled. The projection patterns of MOs5_1 matched well with the mirrored sum pattern of MOs5_2 and MOs5_3. In other words, the spatially tuned connectivity analysis revealed both the anatomical distribution of neuron subtypes and their symmetry. While the reconstructions of neurons of this MOs5 dataset had three anatomical subtypes when both hemispheres of the brain were considered, there were only two genuine subtypes (Fig. 5e) that were distributed symmetrically on the brain’s coronal plane. These two subtypes might be further subdividable as implied in the respective clustering tree (Fig. 5e).
a, Three connectivity-based clusters for MOs5 neurons (top row) along with the distribution of their somas (bottom left) and the overall projection patterns of MOs5 neurons (bottom right)37. b, Key morphological features of the three connectivity-based MOs5 subtypes. The number of involved axonal morphometry samples is as follows: MOs5_1, 143; MOs5_2, 128; MOs5_3, 190. Similarly, the number of involved dendritic morphometry samples is as follows: MOs5_1, 143; MOs5_2, 128; MOs5_3, 190. c, Two metrics, neuron-beta and correlation coefficient, between single neurons and neuron populations in motor cortex, specifically MOp, MOs2/3, MOs5 and MOs6a subtypes. The number of single neurons and mesoscale projection experiments is indicated in brackets on the x axis. In b and c, the elements of box plots are consistent with those in Fig. 4l. d, Correlation of dendritic and axonal morphological features for MOs5 connectivity subtypes, along with examples of the first MOs5 cluster. Note that the clustered neurons in a might not have dendrite reconstructions; in this dendro-axonal correlation analysis, only neurons in a that also have full dendrites and axons are counted. e, Transcriptomic profile-based single-neuron clustering of FRP-MOs neurons (n = 34,331) and more specific FRP-MOs layer 5 neurons (n = 9,879), compared with the clustering based on connectivity and morphology features of FRP-MOs/FRP-MOs layer 5 neurons. Region labels in top left: CGE, caudal ganglionic eminence; L2/3 IT, intratelencephalic (IT) layers 2 and 3; L4/5/6 IT Car3, IT, Car3+, layers 4–6; MGE, medial ganglionic eminence; NP/CT/L6b, near-projecting (NP)/corticothalamic (CT)/layer 6b; PT, pyramidal tract.
We also examined both the axonal and dendritic morphologies of MOs5 subtypes. While most dendritic features of the two genuine subtypes, MOs5_2 and MOs5_3, were similar to each other, their axonal features (Fig. 6b) were different in area, width and relative shift of centers, despite the similar numbers of axonal bifurcations. This means that, although these two subtypes had similar branching complexity, their projection patterns differed. Such variability of MOs5_2 and MOs5_3 was also seen in the different correlation and neuron-beta (ref. 3) values compared with the overall MOs population projection (Fig. 6c). The respective scores of MOs5-versus-population and MOs5_1-versus-population were comparable to each other, indicating that MOs5_1 was a good ipsilateral approximation of the overall MOs5 patterns, and also served as the sum reference for MOs5_2 and MOs5_3 (Fig. 6c). The MOs5 and MOs layer 2/3 neurons also covaried strongly with the MOs population projection. In contrast, MOp neurons showed more variation in the single-neuron-versus-population comparison, while their integrative projection pattern also matched previous population projection data37 (Fig. 6c). We also correlated the m-features of individual neurons’ dendritic and axonal arbors. For MOs5 neurons, m-features such as the number of bifurcations and total length showed a recognizable level of correlation, in the range of ~0.3–0.7, between dendrites and axons (Fig. 6d).
As the community has generated multimodal datasets3,17,23,38 that carry information on the anatomical and molecular attributes of individual neurons, we attempted to cross-validate these data with respect to neuronal connectivity. Specifically, we compared the transcriptomic subtypes of single MOs neurons38, connectivity subtypes and morphological subtypes (Fig. 6e). As the quality-controlled transcriptomic data of MOs and FRP (frontal pole of the cerebral cortex) were mixed due to the limited spatial resolution and it was difficult to separate cells from these two brain regions during the experiment, we prepared connectivity and morphological features of individual neurons in a similar way, specifically for layer 5 (Fig. 6e) and layer 2/3 (Extended Data Fig. 4a). Within each of these individual scenarios, we observed relatively coherent subtyping except for the cases of morphological features. For quantitative measurements and adaptation to conventional transcriptomic analysis methods, we used the Louvain algorithm to identify two to ten clusters for each modality’s data and calculated the modularity of each cluster result (Extended Data Fig. 4b). We found that connectivity features exhibited a higher degree of modularity than transcriptomic features in defining neuron subtypes when there were fewer than five clusters (Extended Data Fig. 4b). We did not observe a conclusive layer-by-layer correspondence between transcriptomic and connectivity subtypes.
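To make the modularity comparison concrete, the following minimal Python sketch computes the Newman modularity Q of a given partition on an undirected weighted graph. The `modularity` helper, the edge-list input and the toy graph in the usage note are illustrative assumptions, not the pipeline used in this study; the Louvain algorithm searches for partitions that maximize this quantity.

```python
from collections import defaultdict

def modularity(edges, labels):
    """Newman modularity Q for an undirected weighted graph.

    edges  : iterable of (u, v, weight) with u != v, each edge listed once
    labels : dict mapping node -> cluster id
    """
    degree = defaultdict(float)   # weighted degree per node
    intra = defaultdict(float)    # total edge weight inside each cluster
    total = 0.0                   # total edge weight m
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        total += w
        if labels[u] == labels[v]:
            intra[labels[u]] += w
    cluster_degree = defaultdict(float)
    for node, k in degree.items():
        cluster_degree[labels[node]] += k
    # Q = sum over clusters c of [ L_c / m - (d_c / 2m)^2 ]
    return sum(intra[c] / total - (cluster_degree[c] / (2 * total)) ** 2
               for c in cluster_degree)
```

For example, two triangles joined by a single edge give a clearly positive Q when each triangle is its own cluster, and Q = 0 when all nodes are placed in one cluster.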
Moreover, we performed a joint analysis of the m-type, c-type, t-type and e-type data based on retrieving the publicly available electrophysiological and transcriptomic recordings of single neurons that also fell into the brain regions used in this study. For the primary visual area, where neurons were found in both morphological and electrophysiological single-neuron datasets, we analyzed 47 fully reconstructed neuron morphologies and their regional connectivity patterns (Extended Data Fig. 5a), along with their morphometric features (Extended Data Fig. 5b) and anatomical locations of cell bodies (Extended Data Fig. 5c). We found that primary visual area neurons in different cortical layers had a less clear separation in morphology (Extended Data Fig. 5d) than in connectivity (Extended Data Fig. 5e). We also reanalyzed previous single-neuron electrophysiological recordings7 based on the concatenated e-type features (Supplementary Table 5) and colored these e-type data using their molecular profiles and anatomical locations (Extended Data Fig. 5f–g). While certain neurons in different layers had preferences in their physiological and molecular properties, there was a general disparity between such features and their c-types.
Subtyping VP nuclei indicates broader multisensory integration
By subtyping single-cell reconstructions of 390 VPM and 83 VPL neurons, we were able to document the broad regional connections of VP neurons in a comprehensive manner. First, we clustered individual neurons’ detailed projections onto cortical areas and layers into eight subtypes as a matrix (Extended Data Fig. 6a). These eight groups had similar separation of their soma locations as well as the respective axonal arbor targets’ locations (Extended Data Fig. 6b,c). The longest dendrites were about five times larger than the shortest dendrites in these groups (Extended Data Fig. 6d). We also confirmed the majority projection of VP neurons to layer 2/3 and 4 of somatosensory cortex (Extended Data Fig. 6a) consistent with previous knowledge at the neuron population level39,40,41,42,43. While our previous study3 implied that a small portion of VP projection might target MOp, the detailed examination presented in the next section (Extended Data Fig. 7) visualized abundant outgoing arborization of VP neurons in MOp regions. We estimated that a nonnegligible 20.7% of VP cells (n = 98) actually projected to multiple cortical areas such as motor or visceral areas that were outside somatosensory cortex, and even beyond CP (Extended Data Fig. 6a).
Furthermore, we found that a single VP neuron could target multiple sensory areas. For example, a VPM neuron can simultaneously project to supplemental somatosensory areas and subareas of the somatosensory cortex, such as mouth, nose, lower limb or upper limb (Extended Data Fig. 6a,e). Some VPL neurons even project to layer 1 in addition to layer 4 (Extended Data Fig. 6e). Such neurons carry two separate axonal clusters: a larger one projecting to VPL neuron’s typical projection target, that is, primary somatosensory area layer 4, and a smaller one targeting layer 1 of a different cortical area, such as supplemental somatosensory areas or even the visual cortex.
The surface area of some VPM cells’ axonal cluster in the barrel field of the somatosensory cortex (largest: 384,942 μm2) is twice that of a barrel (Extended Data Fig. 6f). Traditionally, it was believed that each VPM cell projects to only one barrel44. Our finding suggests potential signal regulation across multiple barrels; thus, the tactile sense signaling transmission could be a multithread process.
In addition, 18.6% of VP neurons (n = 88) possess small branches with bouton terminations in subcortical striatum, suggesting VP–striatum projections (Extended Data Fig. 6g). Our finding indicates a new pathway in the thalamic–subcortical circuit, supplementing the main pathway of VP nuclei to somatosensory cortex. Taken together, these single-cell VP reconstructions give clues to supplementary and complex signal transmission paths in multisensory integration circuits.
Subtyping target connections of thalamocortical neurons
In addition to the outgoing ‘forward’ connection patterns examined in preceding sections, we also investigated the diversity of incoming connections of a target brain region. Previous literature shows that the MOp receives thalamocortical projection from the sensory–motor relay nuclei VAL (ventral anterior-lateral complex of the thalamus) and the modulatory or high-order nuclei such as VM (ventral medial nucleus of the thalamus) or PO (posterior complex of the thalamus)45,46,47. Our analysis revealed additional connections from sensory relay nuclei VPM and VPL (Extended Data Figs. 6 and 7).
With the whole-brain mapped full reconstructions we produced, it can be seen that individual neurons from PO, VM and VAL project to motor and somatosensory areas as a whole spectrum of connectivity subtypes (Extended Data Fig. 7a,b). Projections of individual neurons display different layer preferences in thalamocortical areas. Such preference in MOp can be summarized as follows: PO neurons (n = 14) focus on mainly layer 2/3 (4/14), layer 2/3 and layer 5 (4/14) and layer 5 (6/14). VAL neurons (n = 34) have five main subtypes of connectivity projecting to (1) layer 1 (4/34), (2) layer 2/3 mainly (6/34), (3) layer 2/3 and layer 5 combined (6/34), (4) layer 5 mainly and layer 6 weakly (16/34) and (5) mainly layer 6a (2/34). VM neurons (n = 13) have several subtypes projecting to layer 1 (8/13) combined with weak projection into other layers, layer 2/3 and layer 5 (4/13) and all layers (1/13). VP neurons (n = 35) can be classified as subtypes including projections to layer 2/3 (30/35), layer 5 (3/35) and layer 6 (2/35), respectively (Extended Data Fig. 7b). Individual examples display axonal cluster phenotypes and projections (Extended Data Fig. 7c). Furthermore, we also performed hierarchical clustering based on either cortical region projections or target MOp layers of these thalamic neurons (Supplementary Fig. 14). Our results showed that VP cells were more distinguishable than cells originating from the other three nuclei VAL, VM and PO (Supplementary Fig. 14). Taken together, these new layer projections from individual thalamocortical neurons suggest fine regulations of the sensory–motor signal circuits. In conclusion, we have attached a figure to summarize the entire work (Extended Data Fig. 8).
Discussion
This work studies the whole-brain scale connectivity of single neurons using a large data archive, leveraging both new dendritic reconstructions that cover the entire brain and existing axonal and full reconstructions3,14,15. There are two remarkable topics in such an integrative approach. First, one may be able to study the building blocks of a brain, that is, organizational subtypes of individual neurons, in terms of connectivity. This work makes an initial attempt toward this end. We approach the cell-typing problem by subtyping neurons within anatomical categories (based on soma locations, referred to as s-types) and incorporating additional neuronal attributes, such as the connectivity features discussed in this study. Second, one may be able to construct and study the microscale connectome based on individual neurons, filling a gap between previous work at the population-level mesoscale connectome and the nanoscale connectome that relies on using EM and/or other super-resolution microscopy methods more suitable for examining synaptic level connections of neurons in potentially smaller, local brain regions.
To understand the potential connectivity of neurons throughout a brain at single-neuron resolution, it is essential to analyze the arborization of axons and dendrites in different anatomical areas. An overall axonal arborization distribution map provides an understanding of the marginal distribution of neurons that innervate from different regions and also highlights that arbors can be entities to study neuronal connectivity. To complete this conceptual approach, we produced brain-wide dendritic arbor domains, which were used to generate the connectivity profiles for each individual neuron. In this way, the connectivity features can be precisely defined and utilized for analysis. This approach therefore constitutes a contribution essential for whole-brain-scale single-neuron analysis. Although an ideal scenario for studying cell typing based on connectivity would involve comprehensive single synapse resolution whole-brain connectomes of mammalian brains, the current absence of such datasets should not deter the development of more pragmatic approaches, like the one we propose. Our findings endorse the notion that expanding efforts to trace single neurons fully can offer a valuable proxy for understanding c-types, with the potential for validation through independent sources. Embarking on the identification of subcellular synaptic structures within these datasets20,21 represents a promising strategy to refine our method further and address the constraints related to inferred connectivity, which previously relied on axon–dendrite colocalization analyses29.
Within a general framework of cell typing, our study demonstrates that morphology cannot accomplish this alone. Based on this study and also previous work3, as well as converging results from invertebrate nervous systems48, we hypothesize that t-types or e-types alone may also be insufficient, and it is an open question how to synergize all these data in a common connectivity framework. We believe that there are two key steps to address this challenge. The first is the generation of connectivity associated t-type, e-type and m-type data. A second step is building a thorough statistical model of all such data to mine the associations and distribution patterns, which could be homogeneous clusters or globally nonlinear manifold patterns49. This statistical model could help to discover genes or their sets, as well as their association patterns, enriched in biological processes or pathways.
This approach of leveraging connectivity-type analysis toward the determination and validation of neuronal cell types is powerful. It can be extended to brain-scale analysis of single neurons’ synaptic connectivity when data become available. An example can be seen in the single-cell c-types defined for a Drosophila brain22 that elaborate on the connection detail built upon morphological and lineage similarities. While such an approach provides the EM-based, ultrascale spatial resolution to precisely pinpoint synaptic connections, it is also subject to noise and imperfect process of data acquisition and computation, which would probably be exacerbated when applied to a much larger and complicated mammalian brain. The strength of the present approach is that we can readily study cell typing and subtyping using the arborization-based regional connectivity, without precise pinpointing of synaptic level connections. This may be valuable when considering that individual synapses are subject to turnover via structural plasticity, while arbor geometry provides a relatively more stable circuit scaffolding50. Connection types and subtypes can also provide a useful blueprint of future synaptic-level analysis. In summary, neuronal connectivity in mammalian brain provides a discriminant in the classification of neuronal cell types, refining and adding class information to existing and widely studied modalities.
Our current approach is subject to several limitations. First, c-types can be defined only for reconstructed axons, if not entire neurons. The more completely an axon is reconstructed, the more accurately its connectivity features can be extracted. However, achieving high reconstruction completeness presents practical challenges due to inherent limitations in current neuron labeling methods. Second, the accuracy of connectivity classification is dependent on the precision of spatial registration. Misplacing neurons in incorrect locations can distort the quantification of their relationships, leading to unreliable connectivity mappings. Third, our method may be influenced by uneven neuronal sampling, particularly due to potential biases introduced in the genetic or viral labeling processes of neurons. To address this, increasing neuronal sampling density by integrating multiple datasets could be an effective solution. In this study, we demonstrate the feasibility of such an approach by merging several independent datasets, highlighting the potential for effective data fusion in connectivity analysis.
Methods
Nomenclature and abbreviation of CCFv3 brain regions
Isocortex: frontal pole (FRP), somatomotor areas (MO), primary motor area (MOp), secondary motor area (MOs), somatosensory areas (SS), primary somatosensory area (SSp), primary somatosensory area, nose (SSp-n), primary somatosensory area, barrel field (SSp-bfd), primary somatosensory area, lower limb (SSp-ll), primary somatosensory area, mouth (SSp-m), primary somatosensory area, upper limb (SSp-ul), primary somatosensory area, trunk (SSp-tr), primary somatosensory area, unassigned (SSp-un), supplemental somatosensory area (SSs), gustatory areas (GU), visceral area (VISC), dorsal auditory area (AUDd), primary auditory area (AUDp), ventral auditory area (AUDv), anterolateral visual area (VISal), anteromedial visual area (VISam), lateral visual area (VISl), primary visual area (VISp), posteromedial visual area (VISpm), anterior cingulate area (ACA), anterior cingulate area, dorsal part (ACAd), anterior cingulate area, ventral part (ACAv), prelimbic area (PL), infralimbic area (ILA), orbital area, lateral part (ORBl), orbital area, medial part (ORBm), orbital area, ventrolateral part (ORBvl), agranular insular area, dorsal part (AId), agranular insular area, posterior part (AIp), agranular insular area, ventral part (AIv), retrosplenial area, lateral agranular part (RSPagl), retrosplenial area, dorsal part (RSPd), retrosplenial area, ventral part (RSPv), posterior parietal association areas (PTLp), anterior area (VISa), rostrolateral visual area (VISrl), temporal association areas (TEa).
Olfactory areas (OLF): piriform area (PIR).
Hippocampal formation (HPF): field CA1 (CA1), field CA3 (CA3), dentate gyrus (DG), entorhinal area, lateral part (ENTl), entorhinal area, medial part, dorsal zone (ENTm), parasubiculum (PAR), postsubiculum (POST), presubiculum (PRE), subiculum (SUB), prosubiculum (ProS).
Cortical subplate (CTXsp): claustrum (CLA).
Striatum (STR): caudate putamen (CP), nucleus accumbens (ACB), olfactory tubercle (OT), lateral septal complex (LSX), lateral septal nucleus, rostral part (LSr), striatum-like amygdalar nuclei (sAMY).
Pallidum (PAL): substantia innominata (SI).
Thalamus (TH): ventral anterior-lateral complex of the thalamus (VAL), ventral medial nucleus of the thalamus (VM), ventral posterior complex of the thalamus (VP), ventral posterolateral nucleus of the thalamus (VPL), ventral posterolateral nucleus of the thalamus, parvicellular part (VPLpc), ventral posteromedial nucleus of the thalamus (VPM), medial geniculate complex (MG), dorsal part of the lateral geniculate complex (LGd), lateral posterior nucleus of the thalamus (LP), posterior complex of the thalamus (PO), anteroventral nucleus of thalamus (AV), anteromedial nucleus (AM), anterodorsal nucleus (AD), lateral dorsal nucleus of thalamus (LD), mediodorsal nucleus of thalamus (MD), submedial nucleus of the thalamus (SMT), nucleus of reuniens (RE), central lateral nucleus of the thalamus (CL), parafascicular nucleus (PF), reticular nucleus of the thalamus (RT).
Hypothalamus (HY): hypothalamic medial zone (MEZ), medial mammillary nucleus (MM), lateral hypothalamic area (LHA), zona incerta (ZI).
Midbrain (MB): substantia nigra, reticular part (SNr), midbrain reticular nucleus (MRN), superior colliculus, motor related (SCm), superior colliculus, motor related, intermediate gray layer (SCig), periaqueductal gray (PAG), anterior pretectal nucleus (APN).
Hindbrain (HB): pons (P), pontine gray (PG).
Medulla (MY): medulla, sensory related (MY-sen), medulla, motor related (MY-mot), intermediate reticular nucleus (IRN), parvicellular reticular nucleus (PARN).
Cerebellar cortex (CBX).
Cerebellar nuclei (CBN).
Full and axon reconstructions
We performed a detailed analysis of 1,741 fully reconstructed single-neuron morphologies3 called BICCN AIBS/SEU-ALLEN and 1,200 full single-neuron reconstructions from the Janelia MouseLight project4. We also analyzed the axonal morphology of 6,357 neurons generated by ION15. All neurons were registered to CCFv3 (ref. 23). These data are also documented in Extended Data Table 1. The naming convention of brain regions follows the CCFv3 and is consistent with previous studies.
Generation of dendritic tracing
We generated 10,860 dendrite reconstructions from fluorescence micro-optical sectioning tomography (fMOST) imaging with the following protocol. First, we collected image samples following the same protocol in our previous study on generating the full reconstructions3. Next, we ran the APP2 algorithm51 for tracing local arbors by taking manually defined and validated somas as the central starting points in local image volumes (1,024 × 1,024 × 512 voxels), for the goal that the joint area of these local volumes covers main dendrite arbors. We ran APP2 with a number of background thresholds (10, 15, 20, 25, 30 and 35) resulting in 6 tracing candidates. Then, we leveraged the set of manually annotated and validated dendritic arbors (from MouseLight and BICCN AIBS/SEU-ALLEN) to filter the automatic tracing results. The [min, max] intervals of the following five features of the dendritic arbors were considered realistic, including ‘tips’ [7, 143], ‘length’ [700, 13615], ‘max path distance’ [108, 1382], ‘average bifurcation angle remote’ [35, 129] and ‘max branch order’ [3,32]. An automatic tracing would be discarded if fewer than four of its features fell out of these limits. If more than one tracing qualified for a soma location, we kept the tracing with the greatest length. We spatially registered all tracings to CCFv3 using mBrainAligner17 (https://github.com/Vaa3D/vaa3d_tools/commits/master/hackathon/mBrainAligner, commit number: b10132f). In total, we collected images from 53 mouse brains, we identified 31,625 neurons and we generated 17,228 qualified tracings. We visually inspected all tracings and discarded those with obvious errors (for example, one trace covers multiple touching neurons, or misalignment during registration), finally obtaining 10,860 proofread dendritic tracings.
Preprocessing of multisource reconstructions
To handle different analysis scenarios, we maintained the full reconstructions and also separated each neuron into dendritic and axonal reconstructions. To ensure a reasonable comparison and aggregation of neuron reconstructions from different sources, we resampled the dendrites, axons and full reconstructions at node spacings of 4 μm, 10 μm and 10 μm, respectively.
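As an illustration of fixed-spacing resampling, the minimal Python sketch below resamples a single unbranched 3D path at a constant arc-length step. The `resample_path` function is a hypothetical helper written for this example, and handling one branch at a time (rather than a whole tree) is a simplifying assumption; it is not the resampling tool used in this study.

```python
import math

def resample_path(points, step):
    """Resample a 3D polyline at a fixed arc-length spacing.

    points : list of (x, y, z) tuples in micrometres, ordered along one branch
    step   : target spacing in micrometres (e.g. 4 for dendrites, 10 for axons)
    The first point is always kept, and the last point is kept so that the
    branch tip is not lost; the final segment may therefore be shorter than step.
    """
    out = [points[0]]
    carried = 0.0  # distance travelled since the last emitted node
    for a, b in zip(points, points[1:]):
        seg = math.dist(a, b)
        if seg == 0:
            continue
        pos = step - carried  # distance along this segment to the next sample
        while pos <= seg:
            t = pos / seg
            out.append(tuple(a[i] + t * (b[i] - a[i]) for i in range(3)))
            pos += step
        carried = seg - (pos - step)
    if out[-1] != points[-1]:
        out.append(points[-1])
    return out
```

Calling `resample_path(branch, 4)` on a dendritic branch and `resample_path(branch, 10)` on an axonal branch would mirror the spacings quoted above.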
Independent reconstructions for validation
To cross-validate the neuron morphologies used in this work, we also considered independent morphologies produced and documented in public resources. In particular, we searched adult mouse neuron reconstructions in certain brain regions via keywords using the searching tool of NeuroMorpho.Org (http://neuromorpho.org/KeywordSearch.jsp). When possible, we kept only the neuron reconstructions tagged as complete or, if those were not available, moderately complete. We searched neurons in four different brain regions (HPF, SS, MO and ACA) using keywords ‘{region}&dendrite&mouse&adult’ where {region} was one of the four acronyms. Details of data sources are listed in Supplementary Table 2 (refs. 52,53,54,55,56,57,58,59,60,61,62).
GMM classification of neuron nodes
We standardized full tracings of single neurons (n = 9,298) in the Stockley–Wheal–Cole format63 (SWC). In brief, an SWC file is a data file format to store the basic skeleton information of a reconstructed neuron, which consists of a series of reconstruction nodes. Each node is a geometrical object approximating a compartment in a neuron. This format has been widely used in the neuroinformatics field to exchange crucial information about the essential morphology of neurons. In the SWC format, rows indicate the positions of individually connected nodes forming a tree, and columns show information on the 3D coordinates, node IDs and parent IDs among others. We used the SWCs with 10-μm spacing between adjacent reconstruction nodes and saved their coordinates in the space of the CCFv3 (ref. 23) at 25 μm isotropic voxel resolution.
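The row-and-column layout described above can be sketched in Python as follows. The sketch assumes the conventional seven-column SWC layout (node ID, type, x, y, z, radius, parent ID) and the standard type codes (1, soma; 2, axon; 3, basal dendrite; 4, apical dendrite); the `SwcNode`, `parse_swc` and `split_by_type` names and the four-node example are hypothetical and only illustrate how a tracing can be read and separated into dendritic and axonal nodes.

```python
from typing import NamedTuple

class SwcNode(NamedTuple):
    node_id: int
    type_id: int      # 1 = soma, 2 = axon, 3 = basal dendrite, 4 = apical dendrite
    x: float
    y: float
    z: float
    radius: float
    parent_id: int    # -1 marks the root node

def parse_swc(text):
    """Parse SWC text into a dict of node_id -> SwcNode, skipping comments."""
    nodes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        n, t, x, y, z, r, p = line.split()[:7]
        nodes[int(n)] = SwcNode(int(n), int(t), float(x), float(y),
                                float(z), float(r), int(p))
    return nodes

def split_by_type(nodes):
    """Return (dendrite_ids, axon_ids) using the standard SWC type codes."""
    dendrite = {i for i, nd in nodes.items() if nd.type_id in (3, 4)}
    axon = {i for i, nd in nodes.items() if nd.type_id == 2}
    return dendrite, axon

# Toy tracing: one soma, one dendritic node and a two-node axon.
example = """# id type x y z radius parent
1 1 0.0 0.0 0.0 5.0 -1
2 3 10.0 0.0 0.0 1.0 1
3 2 0.0 -10.0 0.0 0.5 1
4 2 0.0 -20.0 0.0 0.5 3
"""
```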
We determined 31 s-types for analysis by requiring each of them to have at least 60 reconstructed neurons. Starting from these 31 s-types in our full neuron tracing dataset, we maximized the numbers of neurons per region and coverage of the whole brain for domain definition by pooling together neurons from various cortical layers and subareas (for example, SSp-m, SSp-bfd and so on). This led to domain definition in 19 CCFv3 brain regions (MOs, AId, ACAd, ACAv, ORBvl, ORBl, ORBm, VPM, CP, AIv, FRP, ILA, MOp, SSp, VPL, SUB, LGd and SSs). In other words, the 19 regions are the anatomically merged version of the 31 regions.
We pooled all SWC coordinates in a single data frame containing their x, y and z locations. We clustered the pooled data using the Mclust64 function with default parameters (mclust R package version 5.4.7). We selected the GMM (among all combinations of spherical, diagonal and ellipsoidal with equal or varying volume, shape and orientation) as it provides optimal clustering as measured by the Bayesian information criterion65 (BIC). BIC is a measure for the comparative evaluation among a finite set of statistical models, based on maximizing the likelihood function while penalizing for the number of parameters in the models. We did not set any threshold for minimum BIC. We chose the GMM type and number of clusters (between 1 and 9) with maximal BIC in each brain region. We saved the resulting classification with the node IDs of each neuron.
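The BIC-based selection can be illustrated with a short Python sketch. It uses the "larger is better" convention, BIC = 2 log L − p log n, which is the convention under which a maximal BIC is selected; the helper names, the candidate dictionary layout and the log-likelihood values in the usage note are made up for illustration and do not come from the fitted models in this study.

```python
import math

def mclust_style_bic(loglik, n_params, n_obs):
    """BIC in the 'larger is better' convention: 2*logL - n_params*log(n_obs)."""
    return 2.0 * loglik - n_params * math.log(n_obs)

def select_model(candidates, n_obs):
    """Pick the candidate model with maximal BIC.

    candidates : list of dicts with keys 'name' (covariance model label),
                 'k' (number of clusters), 'loglik' and 'n_params'
    Returns (name, k, bic) of the best-scoring candidate.
    """
    scored = [(mclust_style_bic(c["loglik"], c["n_params"], n_obs), c)
              for c in candidates]
    best_bic, best = max(scored, key=lambda pair: pair[0])
    return best["name"], best["k"], best_bic
```

With 1,000 hypothetical 3D points, a spherical equal-volume model with two clusters (8 parameters, log L = −5,200) loses to a fully ellipsoidal model with two clusters (19 parameters, log L = −5,100), which in turn beats the three-cluster ellipsoidal model (29 parameters, log L = −5,090) because the extra parameters are penalized more than the likelihood improves.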
Definition of arbor domains using α-shape
We used GMM clustering to group SWC node coordinates of full neuron reconstructions as detailed above. We defined 3D arbor domains by determining the minimum volume that could enclose all nodes of each cluster. For this purpose, we used the alphashape3d R package66 version 1.3.1. The 3D α-shape is a generalized definition derived from the Delaunay triangulation67. We controlled the level of detail in the triangulation by using a value of 0.4 as the parameter α. We categorized arbor domains containing the neurons’ respective somas as dendritic, and we used them in downstream connectivity analyses. Practically, the domains were also intersected with the CCFv3 brain regions approximated by the evenly sampled neuron reconstructions. We plotted two-dimensional sections of the arbor domains overlaid on the mouse brain atlas using the R base plot function (version 4.1.0).
The definition of dendritic domains based on full tracings was obtained both for raw data distributed in both brain hemispheres and for flipped neurons, ensuring that all of them had somas in the same hemisphere. To further analyze connectivity, in that case, dendritic domains were flipped to also recapitulate homologous contralateral regions.
In addition to the arbor domains obtained from fully traced neurons, we also generated single dendritic domains using all node coordinates of the dendritic tracings with somas inside each of the 19 brain regions with most neurons. Three-dimensional α-shapes were defined using the same method. In this case, all coordinates were pooled in a single set for each brain hemisphere. As a result, domains obtained from automated dendrite tracings recapitulate the region volumes defined in CCFv3.
Single-neuron connectivity to dendritic arbor domains
To define outgoing connections from single fully traced neurons to dendritic arbor domains, we measured the spatial overlap between single neurons and arbor domains. We obtained all voxels enclosed by each domain 3D α-shape using the inashape3d function in the alphashape3d R package (version 1.3.1) and saved the voxels as a 3D mask polygon in the CCFv3 space. To convert surface files in polygon file format (.ply) to 3D masks, we used binvox68 version 1.35. We then obtained a 3D volume where each voxel contains an array of indices identifying each 3D α-shape volume covering that voxel. We obtained α-shapes for each individual fully traced neuron (α = 0.4) and saved the enclosed volume as a 3D mask. Finally, we measured the overlap volume between each single-neuron mask and the volume containing all 3D arbor domain indices. We saved the overlapping volume between each neuron and each dendritic domain as a connectivity matrix.
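The final overlap step can be sketched in Python as follows. The sketch assumes that each neuron mask and each domain mask has already been voxelized into a set of integer voxel indices on the 25-μm CCFv3 grid; the `connectivity_matrix` function, the dictionary inputs and the set-based representation are illustrative assumptions, not the implementation used here.

```python
VOXEL_UM = 25  # isotropic CCFv3 voxel size quoted in the text, in micrometres

def connectivity_matrix(neuron_masks, domain_masks, voxel_um=VOXEL_UM):
    """Overlap volume (um^3) between each neuron mask and each domain mask.

    neuron_masks : dict neuron_id -> set of (i, j, k) voxel indices
    domain_masks : dict domain_id -> set of (i, j, k) voxel indices
    Returns (neuron_ids, domain_ids, matrix) where matrix[r][c] is the
    overlap volume between neuron_ids[r] and domain_ids[c].
    """
    voxel_volume = voxel_um ** 3
    neuron_ids = sorted(neuron_masks)
    domain_ids = sorted(domain_masks)
    matrix = [[len(neuron_masks[n] & domain_masks[d]) * voxel_volume
               for d in domain_ids]
              for n in neuron_ids]
    return neuron_ids, domain_ids, matrix
```

Each row of the returned matrix is then the connectivity profile of one neuron across all dendritic arbor domains.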
SVM clustering of morphology and connectivity
To assess the relevance of arbor domain connectivity for defining established cell types (s-types) and potential connectivity subtypes (c-types), we used an SVM (hyperoverlap R package version 1.1.1; linear kernel, cost = 1,000 and stoppage.threshold = 0.2) to classify neurons with somas located in MOp, SUB and VPL regions32,69. For each pair of brain regions, we used the SVM to cluster the data into two groups. To assess the separation of the neurons in the space defined by the two morphological variables ‘total length’ and ‘maximum branch order’, we measured the pairwise overlap of points from each of the three brain regions. To account for arbor domain connectivity, we obtained a PCA from the connectivity matrix of the analyzed neurons. We performed pairwise SVM classification analogously after adding the first three principal components of the connectivity matrix to the dataset. We performed Wilcoxon tests with the rstatix R package (version 0.7.0). We plotted these results using the ggplot2 R package (version 3.4.0).
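The feature-assembly step can be sketched in Python as follows. This is a minimal illustration that covers only the PCA and the concatenation with the two morphological variables; it does not reproduce the hyperoverlap SVM itself. The helper `first_principal_components` and the random toy matrices are assumptions introduced for this example.

```python
import numpy as np

def first_principal_components(matrix, n_components=3):
    """Project rows of `matrix` onto its first principal axes (PCA via SVD)."""
    x = np.asarray(matrix, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Toy data: 20 neurons, 8 dendritic domains, 2 morphological variables.
rng = np.random.default_rng(0)
connectivity = rng.random((20, 8))   # stand-in for the connectivity matrix
morphology = rng.random((20, 2))     # stand-in for total length, max branch order

pcs = first_principal_components(connectivity)
features = np.hstack([morphology, pcs])   # five features per neuron
```

The resulting `features` array is what a pairwise two-class SVM would receive for the connectivity-augmented comparison.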
m/c-score metric
The m/c score can be used to quantify the dissimilarity of morphological and connectivity (arrays of spatial overlap between each single neuron and all dendritic arbor domains) features between two clusters, taking into account both their intraclass similarity and interclass separation. A higher score indicates a greater difference between two clusters, while a lower score indicates more similarity. The m/c score is calculated as
where \({{{\mathrm{Dist}}}}_{{\rm{interclass}}}\) represents the interclass distance between the centers of two clusters, which is calculated using the Manhattan distance metric70. \({{{\mathrm{Dist}}}}_{{{\mathrm{intraclass}}}(x)}\) represents the intraclass distance of cluster \(x\), which is defined as the average of the Manhattan distances between each sample and all other samples within the same cluster.
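The two ingredients defined above can be sketched in Python as follows. The function names are hypothetical, and the intraclass average is taken over all ordered pairs of distinct samples, which is one reading of the definition given here. The sketch computes only the interclass and intraclass distances and does not attempt to combine them into the score.

```python
import numpy as np

def manhattan(a, b):
    """Manhattan (L1) distance between two feature vectors."""
    return float(np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)).sum())

def interclass_distance(cluster_a, cluster_b):
    """L1 distance between the two cluster centers (feature-wise means)."""
    return manhattan(np.mean(cluster_a, axis=0), np.mean(cluster_b, axis=0))

def intraclass_distance(cluster):
    """Average L1 distance between each sample and all other samples."""
    x = np.asarray(cluster, dtype=float)
    n = len(x)
    if n < 2:
        return 0.0  # assumption: a single-sample cluster has zero spread
    pairwise = np.abs(x[:, None, :] - x[None, :, :]).sum(axis=-1)
    # The diagonal is zero, so dividing by n * (n - 1) averages over distinct pairs.
    return float(pairwise.sum() / (n * (n - 1)))

# Toy clusters in a two-feature space.
cluster_a = [[0.0, 0.0], [2.0, 0.0]]
cluster_b = [[10.0, 0.0], [10.0, 4.0]]
d_inter = interclass_distance(cluster_a, cluster_b)
d_intra_a = intraclass_distance(cluster_a)
d_intra_b = intraclass_distance(cluster_b)
```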
For m-score matrix clustering, we applied hierarchical clustering using the clustermap function (method = ‘ward’, metric = ‘euclidean’) in the Seaborn Python package (version 0.11.2). We used the umap-learn Python package (version 0.5.1) to implement UMAP decomposition with default parameters and plotted the results as scatter plots with Matplotlib (version 3.3.4).
Anatomy-based distance metric
The distance metric follows the Mahalanobis definition71. Let \({s}_{i}=\left[{x}_{i},{y}_{i},{z}_{i}\right]\) be the position of soma \(i\) in 3D space. For computational convenience, soma locations are mirrored to the ipsilateral hemisphere. For two somas \({s}_{1}\) and \({s}_{2}\), the anatomy-based distance was defined using
where \({{{\mathrm{Cov}}}}_{{{\mathrm{anatomy}}}}\) represents the covariance of 3D positions of voxels of the relevant ipsilateral anatomical region in the 25 μm CCFv3 reference space volume.
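For reference, the standard Mahalanobis form that this definition is stated to follow can be written as below. The symbol \(d_{\mathrm{anatomy}}\) is a name introduced here for illustration, and treating \(s_i\) as a row vector is an assumption based on the definition of \(s_i\) above.

```latex
d_{\mathrm{anatomy}}(s_1, s_2)
  = \sqrt{(s_1 - s_2)\,\mathrm{Cov}_{\mathrm{anatomy}}^{-1}\,(s_1 - s_2)^{\mathsf{T}}}
```

Under this form, two somas separated along an axis in which the region is elongated are closer than two somas separated by the same physical distance along a narrow axis of the region.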
Distance-weighted connectivity-based clustering
Distance-weighted connectivity-based clustering was used to cluster s-type cells on the basis of both their connectivity feature similarity and the physical distance between their somas. Two matrices were generated to represent these components: a connectivity similarity matrix (c-similarity matrix denoted by \({M}_{{\mathrm{C}}}\)) calculated using cosine similarity, and a distance matrix (d-map denoted by \({M}_{{\mathrm{D}}}\)) calculated on the basis of the anatomy-based distance between the somas of the cells. Both matrices were linearly normalized to values between 0 and 1. To emphasize spatial adjacency, a distance affinity matrix (\({M}_{{{\mathrm{DA}}}}\)) was constructed using a Gaussian kernel: \({M}_{{{\mathrm{DA}}}}=\exp \left(-{M}_{{\mathrm{D}}}\times {M}_{{\mathrm{D}}}\right)\). This ensured that larger values in the affinity matrix indicated greater spatial adjacency between cells.
Hierarchical clustering was subsequently applied on the matrix resulting from multiplying \({M}_{{\mathrm{C}}}\) and \({M}_{{{\mathrm{DA}}}}\) to produce clustering results. The optimal number of clusters was determined automatically using the silhouette score72 (metrics.silhouette_score function from the scikit-learn Python package, version 0.24.2).
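The construction of the combined matrix can be sketched in Python as follows. The function names are hypothetical, and both the square in the Gaussian kernel and the product of \({M}_{{\mathrm{C}}}\) and \({M}_{{{\mathrm{DA}}}}\) are taken element-wise, which is an assumption of this sketch. Hierarchical clustering and silhouette-based selection of the cluster number would then operate on the returned matrix and are not shown.

```python
import numpy as np

def minmax(m):
    """Linearly rescale a matrix to the range [0, 1]."""
    m = np.asarray(m, dtype=float)
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def cosine_similarity(features):
    """Pairwise cosine similarity between rows (rows must be non-zero)."""
    x = np.asarray(features, dtype=float)
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T

def distance_weighted_similarity(features, soma_distance):
    m_c = minmax(cosine_similarity(features))   # c-similarity matrix
    m_d = minmax(soma_distance)                 # normalized d-map
    m_da = np.exp(-m_d * m_d)                   # Gaussian distance affinity
    return m_c * m_da                           # element-wise product (assumption)

# Toy example: three cells, two connectivity features, pairwise soma distances.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
soma_distance = [[0.0, 1.0, 2.0],
                 [1.0, 0.0, 2.0],
                 [2.0, 2.0, 0.0]]
weighted = distance_weighted_similarity(features, soma_distance)
```

In this toy case, cells 0 and 1 have identical connectivity but sit half of the maximum distance apart, so their weighted similarity drops from 1 to exp(−0.25).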
Neuron-beta
In the delineation of distinct neuron cohorts, characterized by specific cerebral regions and/or cortical layers, we computed the mean of mesoscale projections denoted as \(M=[{m}_{1},\,{m}_{2},\ldots ,{m}_{p}]\), where p represents the count of brain areas. For individual neuron projections, denoted as \(S=[{s}_{1},\,{s}_{2},\ldots ,{s}_{p}]\), we defined the neuron-beta value as follows:
Correlation between single-cell and population morphology, projections and transcriptomics
We used a transcriptomic dataset of 34,331 neurons in MOs and FRP brain regions, which was collected from a newly released dataset38. The analysis was performed with SCANPY (a Python package, version 1.9.3). The original data were composed of 31,053 genes. To ensure data quality, we performed quality control, which included filtering out 10,352 genes that were detected in fewer than 3 cells (20,701 genes left) and filtering out 922 cells that expressed over 6,000 genes (33,409 cells left), where the threshold was determined on the basis of the distribution of the number of expressed genes (Supplementary Fig. 15). Following the same procedure, we finally used 9,255 cells and 21,621 genes for analyses of MOs_FRP_L5 and 6,129 cells and 20,920 genes for MOs_FRP_L2/3. We normalized the data (using functions pp.normalize_total, pp.log1p and pp.regress_out, under default parameters) and reduced their dimension (using tl.pca and tl.umap, under default parameters) for visualization.
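The two quality-control filters can be illustrated on a small count matrix with plain NumPy. This is a sketch of the filtering logic only, not the SCANPY calls used in the study; the function name `quality_control`, the order of the two filters (genes first, then cells) and the toy thresholds are assumptions made for this example.

```python
import numpy as np

def quality_control(counts, min_cells=3, max_genes=6000):
    """Filter a cells x genes count matrix.

    Returns the filtered matrix and the indices of kept cells and genes.
    """
    counts = np.asarray(counts)
    # Step 1: drop genes detected in fewer than `min_cells` cells.
    gene_keep = (counts > 0).sum(axis=0) >= min_cells
    counts_g = counts[:, gene_keep]
    # Step 2: drop cells that express more than `max_genes` genes.
    cell_keep = (counts_g > 0).sum(axis=1) <= max_genes
    return counts_g[cell_keep], np.flatnonzero(cell_keep), np.flatnonzero(gene_keep)

# Toy matrix: 4 cells x 4 genes; a small max_genes keeps the example readable.
toy = np.array([[1, 1, 1, 1],
                [1, 1, 0, 1],
                [1, 1, 0, 0],
                [1, 0, 0, 1]])
filtered, kept_cells, kept_genes = quality_control(toy, min_cells=3, max_genes=2)
```

The counts reported above are consistent with this two-step scheme: 31,053 − 10,352 = 20,701 genes and 34,331 − 922 = 33,409 cells.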
Modularity calculation in joint analysis
For the UMAP analysis in Fig. 6e and Extended Data Fig. 4, there are four steps. (1) Dimensionality reduction: apply PCA with a variance retention threshold of 99% on normalized connectivity or morphological features to reduce dimensionality. (2) Graph construction: create a k-nearest neighbors graph using the Euclidean distance metric based on the reduced data, and weight the k-nearest neighbors graph with the Jaccard distance. (3) Louvain clustering with different resolution parameters: iteratively create instances of the Louvain clustering73 over a range of resolution parameters of \({1.01}^{n}\) (\(n\in \left[-\text{1,000},\,\text{1,000}\right]\), step size of 1). (4) Evaluate and filter clustering results: exclude results with fewer than two or more than ten clusters, and compute the modularity score for each clustering solution.
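Steps (3) and (4) can be sketched in Python as follows. The sketch builds the resolution grid, filters partitions by cluster count and evaluates the standard Newman modularity of a given partition. It does not run Louvain clustering itself; the function names and the toy graph are assumptions made for this example, and the modularity is computed without a resolution term.

```python
import numpy as np

# Step (3): resolution grid 1.01**n for n = -1000, ..., 1000.
resolutions = [1.01 ** n for n in range(-1000, 1001)]

def keep_partition(labels, min_clusters=2, max_clusters=10):
    """Step (4): keep only solutions with two to ten clusters."""
    return min_clusters <= len(set(labels)) <= max_clusters

def modularity(adjacency, labels):
    """Newman modularity Q of a partition of an undirected weighted graph."""
    a = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    two_m = a.sum()                      # equals 2m for a symmetric matrix
    degree = a.sum(axis=1)
    same = labels[:, None] == labels[None, :]
    expected = np.outer(degree, degree) / two_m
    return float(((a - expected) * same).sum() / two_m)

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = np.zeros((6, 6))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1.0

labels = [0, 0, 0, 1, 1, 1]
q = modularity(adjacency, labels) if keep_partition(labels) else None
```

In practice, each entry of `resolutions` would be passed to a Louvain implementation, and only partitions that pass `keep_partition` would contribute a modularity value.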
Electrophysiological data analysis
For the electrophysiological modality, we selected 919 cells in VISp layers from a quality-controlled Patch-seq dataset7. The selected dataset had five transcriptomic labels (Pvalb Reln Itm2a, Sst Hpse Cbln4, Sst Calb2 Pdlim5, Lamp5 Lsp1 and Pvalb Sema3e Kank4) and five structure labels (VISp1, VISp2/3, VISp4, VISp5 and VISp6a). The UMAP layout of the dataset showed three distinct populations. For the electrophysiological profiles of the 919 cells, we used IPFX (a Python package, version 1.0.7) for feature extraction, generating 13 electrophysiological features (Supplementary Table 5) for each cell. We concatenated these features into one vector profiling each cell in subsequent analyses.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The local reconstructions of dendrites (DEN-SEU) and key computational materials are available via Zenodo at https://doi.org/10.5281/zenodo.14242860 (ref. 74). The Allen Mouse Brain Common Coordinate Framework can be accessed via the Allen Reference Atlas at https://mouse.brain-map.org/. Neuron full reconstructions from MouseLight data are available at https://ml-neuronbrowser.janelia.org/. Axonal reconstructions from ION data75 can be accessed at https://doi.org/10.12412/BSDC.1667282400.20002. The single-cell transcriptomes data used in this study can be downloaded from https://portal.brain-map.org/atlases-and-data/rnaseq/mouse-whole-cortex-and-hippocampus-10x. Electrophysiological data of VISp neurons utilized in this study can be accessed at https://portal.brain-map.org/cell-types/classes/multimodal-characterization.
Code availability
The code and scripts used for the analyses and figure plotting in this study are publicly available under the MIT license via GitHub at https://github.com/YunZhixi98/Connectivity_Type. Vaa3D (version 3.601) can be accessed via GitHub at https://github.com/Vaa3D. The mBrainAligner tool is also available via GitHub at https://github.com/Vaa3D/vaa3d_tools/tree/master/hackathon/mBrainAligner.
References
Purves, D. et al. Neurosciences (De Boeck Supérieur, 2019).
Luo, L. Principles of Neurobiology (Garland Science, 2015).
Peng, H. et al. Morphological diversity of single neurons in molecularly defined cell types. Nature 598, 174–181 (2021).
Winnubst, J. et al. Reconstruction of 1,000 projection neurons reveals new cell types and organization of long-range connectivity in the mouse brain. Cell 179, 268–281 (2019).
Zeng, H. & Sanes, J. R. Neuronal cell-type classification: challenges, opportunities and the path forward. Nat. Rev. Neurosci. 18, 530–546 (2017).
Muñoz-Castañeda, R. et al. Cellular anatomy of the mouse primary motor cortex. Nature 598, 159–166 (2021).
Gouwens, N. W. et al. Integrated morphoelectric and transcriptomic classification of cortical GABAergic cells. Cell 183, 935–953 (2020).
Moffitt, J. R. et al. Molecular, spatial, and functional single-cell profiling of the hypothalamic preoptic region. Science 362, eaau5324 (2018).
Zhang, M. et al. Molecularly defined and spatially resolved cell atlas of the whole mouse brain. Nature 624, 343–354 (2023).
Lipovsek, M. et al. Patch-seq: past, present, and future. J. Neurosci. 41, 937–946 (2021).
Kalmbach, B. E. et al. Signature morpho-electric, transcriptomic, and dendritic properties of human layer 5 neocortical pyramidal neurons. Neuron 109, 2914–2927 (2021).
y Cajal, S. R. Histologie du système nerveux de l’homme & des vertébrés: Cervelet, cerveau moyen, rétine, couche optique, corps strié, écorce cérébrale générale & régionale, grand sympathique vol. 2 (A. Maloine, 1911).
Peng, H. et al. BigNeuron: large-scale 3D neuron reconstruction from optical microscopy images. Neuron 87, 252–256 (2015).
Manubens-Gil, L. et al. BigNeuron: a resource to benchmark and predict best-performing algorithms for automated reconstruction of neuronal morphology. Nat. Methods https://doi.org/10.1038/s41592-023-01848-5 (2023).
Gao, L. et al. Single-neuron projectome of mouse prefrontal cortex. Nat. Neurosci. 25, 515–529 (2022).
Han, X. et al. Whole human-brain mapping of single cortical neurons for profiling morphological diversity and stereotypy. Sci. Adv. 9, eadf3771 (2023).
Qu, L. et al. Cross-modal coherent registration of whole mouse brains. Nat. Methods 19, 111–118 (2022).
Abbott, L. F. et al. The mind of a mouse. Cell 182, 1372–1376 (2020).
Axer, M. & Amunts, K. Scale matters: the nested human connectome. Science 378, 500–504 (2022).
Turner, N. L. et al. Reconstruction of neocortex: organelles, compartments, cells, circuits, and activity. Cell 185, 1082–1100 (2022).
Dorkenwald, S. et al. Binary and analog variation of synapses between cortical pyramidal neurons. eLife 11, e76120 (2022).
Scheffer, L. K. et al. A connectome and analysis of the adult Drosophila central brain. eLife 9, e57443 (2020).
Wang, Q. et al. The Allen mouse brain common coordinate framework: a 3D reference atlas. Cell 181, 936–953 (2020).
Akram, M. A., Nanda, S., Maraver, P., Armañanzas, R. & Ascoli, G. A. An open repository for single-cell reconstructions of the brain forest. Sci. Data 5, 180006 (2018).
Bijari, K., Akram, M. A. & Ascoli, G. A. An open-source framework for neuroscience metadata management applied to digital reconstructions of neuronal morphology. Brain Informatics 7, 2 (2020).
Binley, K. E., Ng, W. S., Tribble, J. R., Song, B. & Morgan, J. E. Sholl analysis: a quantitative comparison of semi-automated methods. J. Neurosci. Methods 225, 65–70 (2014).
Scorcioni, R., Polavaram, S. & Ascoli, G. A. L-Measure: a web-accessible tool for the analysis, comparison and search of digital reconstructions of neuronal morphologies. Nat. Protocols 3, 866–876 (2008).
Wan, Y. et al. BlastNeuron for automated comparison, retrieval and clustering of 3D neuron morphologies. Neuroinformatics 13, 487–499 (2015).
Rees, C. L., Moradi, K. & Ascoli, G. A. Weighing the evidence in Peters’ rule: does neuronal morphology predict connectivity? Trends Neurosci. 40, 63–71 (2017).
Paxinos, G. & Franklin, K. B. Paxinos and Franklin’s the Mouse Brain in Stereotaxic Coordinates (Academic Press, 2019).
Dong, H. W. The Allen Reference Atlas: A Digital Color Brain Atlas of the C57Bl/6J Male Mouse (Wiley, 2008).
Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
Steinwart, I. & Christmann, A. Support Vector Machines (Springer, 2008).
Guido, W. Development, form, and function of the mouse visual thalamus. J. Neurophysiol. 120, 211–225 (2018).
Okigawa, S. et al. Cell type- and layer-specific convergence in core and shell neurons of the dorsal lateral geniculate nucleus. J. Comp. Neurol. 529, 2099–2124 (2021).
Yang, J. H. & Kwan, A. C. Secondary motor cortex: broadcasting and biasing animal’s decisions through long-range circuits. Int. Rev. Neurobiol. 158, 443–470 (2021).
Oh, S. W. et al. A mesoscale connectome of the mouse brain. Nature 508, 207–214 (2014).
Yao, Z. et al. A taxonomy of transcriptomic cell types across the isocortex and hippocampal formation. Cell 184, 3222–3241 (2021).
Bureau, I., von Saint Paul, F. & Svoboda, K. Interdigitated paralemniscal and lemniscal pathways in the mouse barrel cortex. PLoS Biol. 4, e382 (2006).
Harris, J. A. et al. Hierarchical organization of cortical and thalamic connectivity. Nature 575, 195–202 (2019).
Viaene, A. N., Petrof, I. & Sherman, S. M. Synaptic properties of thalamic input to layers 2/3 and 4 of primary somatosensory and auditory cortices. J. Neurophysiol. 105, 279–292 (2011).
Clascá, F., Rubio‐Garrido, P. & Jabaudon, D. Unveiling the diversity of thalamocortical neuron subtypes. Eur. J. Neurosci. 35, 1524–1532 (2012).
Staiger, J. F. & Petersen, C. C. Neuronal circuits in barrel cortex for whisker sensory perception. Physiol. Rev. 101, 353–415 (2021).
Pierret, T., Lavallée, P. & Deschênes, M. Parallel streams for the relay of vibrissal information through thalamic barreloids. J. Neurosci. 20, 7455–7462 (2000).
Kuramoto, E. et al. Complementary distribution of glutamatergic cerebellar and GABAergic basal ganglia afferents to the rat motor thalamic nuclei. Eur. J. Neurosci. 33, 95–109 (2011).
Guo, K., Yamawaki, N., Svoboda, K. & Shepherd, G. M. Anterolateral motor cortex connects with a medial subdivision of ventromedial thalamus through cell type-specific circuits, forming an excitatory thalamo-cortico-thalamic loop via layer 1 apical tuft dendrites of layer 5B pyramidal tract type neurons. J. Neurosci. 38, 8787–8797 (2018).
Guo, K., Yamawaki, N., Barrett, J. M., Tapies, M. & Shepherd, G. M. Cortico-thalamo-cortical circuits of mouse forelimb S1 are organized primarily as recurrent loops. J. Neurosci. 40, 2849–2858 (2020).
Mehta, K., Goldin, R. F. & Ascoli, G. A. Circuit analysis of the Drosophila brain using connectivity-based neuronal classification reveals organization of key communication pathways. Netw. Neurosci. 7, 269–298 (2023).
Liu, L. & Qian, P. Manifold classification of neuron types from microscopic images. Bioinformatics 38, 4987–4989 (2022).
Stepanyants, A., Hof, P. R. & Chklovskii, D. B. Geometry and structural plasticity of synaptic connectivity. Neuron 34, 275–288 (2002).
Xiao, H. & Peng, H. APP2: automatic tracing of 3D neuron morphology based on hierarchical pruning of a gray-weighted image distance-tree. Bioinformatics 29, 1448–1454 (2013).
Yamashita, T. et al. Diverse long-range axonal projections of excitatory layer 2/3 neurons in mouse barrel cortex. Front. Neuroanat. 12, 33 (2018).
Gong, H. et al. High-throughput dual-colour precision imaging for brain-wide connectome with cytoarchitectonic landmarks at the cellular level. Nat. Commun. 7, 12142 (2016).
Iascone, D. M. et al. Whole-neuron synaptic mapping reveals spatially precise excitatory/inhibitory balance limiting dendritic and somatic spiking. Neuron 106, 566–578 (2020).
Cohen, L. et al. Time-lapse electrical recordings of single neurons from the mouse neocortex. Proc. Natl Acad. Sci. USA 110, 5665–5670 (2013).
Smit-Rigter, L. A. et al. Prenatal fluoxetine exposure induces life-long serotonin 5-HT3 receptor-dependent cortical abnormalities and anxiety-like behaviour. Neuropharmacology 62, 865–870 (2012).
Suter, B. A. & Shepherd, G. M. Reciprocal interareal connections to corticospinal neurons in mouse M1 and S2. J. Neurosci. 35, 2959–2974 (2015).
Jiang, S. et al. Anatomically revealed morphological patterns of pyramidal neurons in layer 5 of the motor cortex. Sci. Rep. 10, 7916 (2020).
Lin, H. M. et al. Reconstruction of intratelencephalic neurons in the mouse secondary motor cortex reveals the diverse projection patterns of single neurons. Front. Neuroanat. 12, 86 (2018).
Morelli, E. et al. Environmental enrichment restores CA1 hippocampal LTP and reduces severity of seizures in epileptic mice. Exp. Neurol. 261, 320–327 (2014).
Murase, S. et al. Matrix metalloproteinase-9 regulates neuronal circuit development and excitability. Mol. Neurobiol. 53, 3477–3493 (2016).
Karlsson, T. E. et al. NgR1: a tunable sensor regulating memory formation, synaptic, and dendritic plasticity. Cerebr. Cortex 26, 1804–1817 (2016).
Stockley, E. W., Cole, H. M., Brown, A. D. & Wheal, H. V. A system for quantitative morphological measurement and electrotonic modelling of neurons: three-dimensional reconstruction. J. Neurosci. Methods 47, 39–51 (1993).
Scrucca, L., Fop, M., Murphy, T. B. & Raftery, A. E. mclust 5: clustering, classification and density estimation using Gaussian finite mixture models. R J. 8, 289 (2016).
Schwarz, G. Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978).
Edelsbrunner, H. & Mücke, E. P. Three-dimensional alpha shapes. ACM Trans. Graph. 13, 43–72 (1994).
Lee, D. T. & Schachter, B. J. Two algorithms for constructing a Delaunay triangulation. Int. J. Comput. Inf. Sci. 9, 219–242 (1980).
Nooruddin, F. S. & Turk, G. Simplification and repair of polygonal models using volumetric techniques. IEEE Trans. Vis. Comput. Graph. 9, 191–205 (2003).
Brown, M. J., Holland, B. R. & Jordan, G. J. hyperoverlap: detecting biological overlap in n‐dimensional space. Methods Ecol. Evol. 11, 513–523 (2020).
Han, J., Pei, J. & Tong, H. Data Mining: Concepts and Techniques (Morgan Kaufmann, 2022).
McLachlan, G. J. Mahalanobis distance. Resonance 4, 20–26 (1999).
Rousseeuw, P. J. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987).
Blondel, V. D., Guillaume, J. L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008 (2008).
Yun, Z., Manubens-Gil, L., Liu, L. & Peng, H. Resources for anatomical connectivity and single-neuron typing/subtyping [data set]. Zenodo https://doi.org/10.5281/zenodo.14242860 (2024).
Wang, X. Single-neuron datasets for mouse prefrontal cortex. Brain Science Data Center, Chinese Academy of Sciences https://cstr.cn/33145.11.BSDC.1667284051.1585453592638009346 (2022).
Acknowledgements
This project was mainly supported by several grants awarded to H.P. H.P. is a New Cornerstone Investigator and a SANS Senior Investigator. The Southeast University team was also supported by a MOST (China) Brain Research Project, ‘Mammalian Whole Brain Mesoscopic Stereotaxic 3D Atlas’ (project numbers 2022ZD0205200 and 2022ZD0205204) awarded to L.L. G.A.A. was supported in part by NIH grants R01NS39600 and RF1MH128693. H.Z. was supported by BRAIN Initiative grant U19MH114830. We thank S. Seung for comments and suggestions on the manuscript and literature, Y. Yao, R. Yuste, D. Van Essen and a number of other experts for various inspirations including discussions, comments, suggestions and community events.
Author information
Contributions
H.P. conceptualized and designed this study, and managed the entire project. L.L. executed the project and supervised Z.Y. and F.X. in generating all results along with the assistance of L.M.-G. and H.C. H.D., M.H. and H.Z. advised on the project and manuscript. G.A.A. advised on selection and analysis of independent reconstructions for validation, participated in critical discussion on the early phases of analysis design and assisted in editing and revising the manuscript, focusing specifically on the relation with extant scientific literature. H.P. wrote the manuscript with input from all coauthors.
Ethics declarations
Competing interests
H.C. was employed by Tencent AI Lab when this work was done. The company did not influence the research. All other authors have no competing interests.
Peer review
Peer review information
Nature Methods thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Nina Vogt, in collaboration with the Nature Methods team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Conceptual illustration of the overall workflow.
Data Generation: DEN-SEU dataset was generated in 5 steps. Dataset: 4 state-of-the-art single neuron reconstruction datasets were aggregated for analysis. Visualizations of neuron morphologies in CCFv3 were displayed on the right. Potential Connectivity: Definition of potential connectivity between single axon tree and dendritic domains. Exploration and Validation: downstream analysis of c-types includes the capability of neuron classification and subtyping, multi-modality analysis.
Extended Data Fig. 3 Spatially tuned clustering of all neurons of different s-types in addition to those in Fig. 5.
Tri-view visualization of neurons in CCFv3. The colors of the neurons are determined by the results of spatially tuned connectivity clustering.
Extended Data Fig. 4 UMAP analysis and clustering modularity of different s-type layers between different data modalities.
a. UMAP visualization of MOs_FRP layer 2/3 based on transcriptomic, connectivity and morphological features. b. Modularity of various clustering results derived from the Louvain algorithm. Error bars indicate standard deviation. The height of the bars represents the average modularity across multiple experiments. The number of technical replicates (n) is indicated below each bar in the plot.
Extended Data Fig. 5 Comparative analysis of projection patterns of VISp neurons originated in various cortical layers (2/3, 4, 5, and 6; n = 16, 11, 11, and 9, respectively), the respective morphological features and soma locations, and public-domain electrophysiological recording and transcriptomic profiles of single neurons.
a. Projection and regional connectivity patterns of VISp neurons, grouped by soma-locations in four cortical layers. b. Comparison of axon features of VISp neurons in four layers. The number of samples involved is consistent with the previous statement. The elements of the box plots are consistent with those in Fig. 4l. c. Locations of VISp neurons used in this study. d. Joint distribution of morphological features and soma locations in the respective UMAP space. e. Joint distribution of connectivity features and soma locations in the respective UMAP space. f. Joint distribution of electrophysiological features and molecular types of VISp neurons in the respective UMAP space. g. Joint distribution of electrophysiological features and soma locations in the respective UMAP space.
Extended Data Fig. 6 Alternative examination of connection types in ventral posterior (VP) nucleus.
a. 8 different VP neuron subtypes clustered and color-coded by projection target regions, particularly cortical layers. Columns: individual neurons. Rows: projection targets of neurons. Color bar: axonal length of a neuron projecting to a specific area. The number of neurons in each cluster is as follows: Cluster 1: 67, Cluster 2: 146, Cluster 3: 13, Cluster 4: 95, Cluster 5: 45, Cluster 6: 13, Cluster 7: 22, Cluster 8: 45. b. Axonal clusters of these VP subtypes mapped to CCF. c. Soma positions and connected dendrites of the 8 subtypes. d. Analysis of dendritic total length (μm) of these 8 clusters of neurons. The number of neuron samples is the same as stated in the legend for panel a. The elements of the box plots are consistent with those in Fig. 4l. e. Examples of VP neurons with zoomed-in coronal view of dendrites. Red arrows: projection targets outside of somatosensory areas; note the VIS target is in layer 1. f. Examples of cluster size located in the barrel field. Note the right cluster covers more than one barrel. g. Visualized CP projections of VPM neurons.
Extended Data Fig. 7 Conjugated MOp projections of individual thalamocortical neurons from PO, VM, VAL and VP nuclei.
a. Co-projection of axonal arbors to MOp and nearby cortical areas originating from PO, VAL, VP, and VM. Color code: normalized arbor density. b. Projection matrices of individual neurons in a. Columns: individual neurons. Rows: projection targets, particularly MOp layers. c. Example neurons from each nucleus. Circular plots: distribution of target projection regions for each neuron.
Extended Data Fig. 8 Summary of major findings, data, and techniques in this study.
Overview of the concepts and findings. For comprehensive details, please refer to the main text and the Extended Data Figures, Tables, and other supporting data.
Supplementary information
Supplementary Information
Supplementary Figs. 1–18.
Supplementary Table 1
Literature cross-validation of connectivity-based cell subclusters.
Supplementary Table 2
Information of public datasets downloaded from https://neuromorpho.org/.
Supplementary Table 3
Acronyms of brain regions used for incoming and outgoing degree computation in Fig. 1c.
Supplementary Table 4
Names of dendritic arbor domains.
Supplementary Table 5
Names of features used in electrophysiological analysis.
Supplementary Table 6
MOs_FRP cells with the strongest expressed top 100 genes within each cell type.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Liu, L., Yun, Z., Manubens-Gil, L. et al. Connectivity of single neurons classifies cell subtypes in mouse brains. Nat Methods 22, 861–873 (2025). https://doi.org/10.1038/s41592-025-02621-6
DOI: https://doi.org/10.1038/s41592-025-02621-6