Extended Data Fig. 5: Robust clustering of nerve-ring processes from M4 spatial population models.
From: A multi-scale brain map derived from whole-brain volumetric reconstructions

The variability of membrane contacts (Fig. 2, Extended Data Fig. 2) suggests that no single contactome is representative of the population, so we estimated the variability among membrane contact areas. a, The log-normalized empirical distribution of M4 membrane contact areas (mean centred at 0; STD, standard deviation; red line, normal distribution with empirical mean and standard deviation; n = 1,258 membrane contacts). To estimate variability across the four datasets (L4 left, L4 right, adult left and adult right), we computed, for each conserved M4 contact, the mean and standard deviation of the membrane contact area across the four datasets (see Methods). b, Standard deviation versus mean contact area across the datasets, where each point is one M4 contact. As in Extended Data Fig. 1a, we find no dependence of the variability on membrane contact area. We therefore estimate membrane contact area variability by the mean variability among all membrane contact areas. c, The distribution of standard deviations of membrane contact area for all M4 contacts. Red dashed line, mean standard deviation. d–i, A stochastic spatial population model matches the above distributions by randomly perturbing membrane contact areas in the four datasets with multiplicative white noise with a standard deviation (σ) of 0.23 (Methods). d–f, The spatial population data perturb the membrane contact areas while maintaining contact area and variability distributions similar to the empirical M4 distributions. g, Perturbed contact areas scale linearly with the empirical contact areas. h, The spread of perturbed contact areas (log of the perturbed contact area as a fraction of the empirical contact area) is mostly uniform across membrane contact areas.
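The perturbation step described for panels d–h can be illustrated with a minimal sketch. It is not the authors' implementation: the legend specifies only multiplicative white noise with σ = 0.23, so the log-normal form used here (area × exp(σξ), with ξ drawn from a standard normal) is an assumption, chosen because it keeps areas positive and makes the log ratio in panel h normally distributed. The function names and the toy contact areas are hypothetical.

```python
import math
import random

SIGMA = 0.23  # noise standard deviation quoted in the legend


def perturb_contacts(areas, sigma=SIGMA, rng=None):
    """Return a perturbed copy of a {cell_pair: contact_area} mapping.

    Assumption: the multiplicative noise is log-normal, i.e. each area is
    multiplied by exp(sigma * xi) with xi ~ N(0, 1). The source does not
    state the exact functional form.
    """
    if rng is None:
        rng = random.Random()
    return {pair: area * math.exp(rng.gauss(0.0, sigma))
            for pair, area in areas.items()}


def make_population(areas, size, sigma=SIGMA, seed=0):
    """Generate `size` perturbed individuals from one empirical contact set."""
    rng = random.Random(seed)
    return [perturb_contacts(areas, sigma, rng) for _ in range(size)]


def log_ratio_spread(population, areas):
    """Standard deviation of log(perturbed / empirical) over all contacts."""
    ratios = [math.log(individual[pair] / areas[pair])
              for individual in population for pair in areas]
    mean = sum(ratios) / len(ratios)
    return math.sqrt(sum((r - mean) ** 2 for r in ratios) / len(ratios))


# Toy empirical contact areas (hypothetical values, arbitrary units).
empirical = {("AVAL", "AVBL"): 10.0, ("RIAL", "RIAR"): 2.5}
population = make_population(empirical, size=1000)
spread = log_ratio_spread(population, empirical)
```

Under this assumption, `spread` should be close to σ regardless of the size of the empirical contact area, which is the behaviour the legend describes for panel h.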
i–l, Neurite clusters obtained from a population of 1,000 \(\tilde{\mathbb{M}^{4}}\) perturbed individuals and 1,000 \(\tilde{\mathrm{L}4}\) and \(\tilde{\mathrm{Adult}}\) perturbed individuals (perturbing left–right conserved contacts in the L4 and adult contact sets). For each perturbed individual in each population, we used a multi-level graph-clustering algorithm to identify spatial clusters. Across each population, we computed the frequency with which cell pairs cluster together, represented as an n × n cluster-frequency matrix (n = 93). A hierarchical clustering algorithm is used to sort the rows and columns of the cluster-frequency matrix to minimize variation along the diagonal. Hence, cell pairs that frequently cluster together are sorted together on the cluster-frequency matrix (Methods). Five largely overlapping subgroups of neurons emerge across different perturbations (see Fig. 1 and ‘Cluster assignment and validation’). i, Consensus clusters are robust across contact sets. \(\tilde{\mathrm{L}4}\) and \(\tilde{\mathrm{Adult}}\) clusters visualized using row and column colours of the \(\tilde{\mathbb{M}^{4}}\) population cluster assignments (dashed box). j, The consensus clusters are robust across different noise amplitudes. Clustering applied to populations generated by perturbations to M4 using white noise with standard deviations of 0 (empirical data), 0.12, 0.45 and 0.9. k, l, The consensus clusters are robust across different spatial domains. k, Clustering applied to \(\tilde{\mathbb{M}^{4}}\) populations generated from a more spatially restricted volume of the neuropil (ref. 34), which excludes its posterior lobe. l, Clustering applied to populations generated by perturbations to all reproducible membrane contact areas after restoring the smallest 35% of contact areas to each of the L4, adult and M4 datasets (Extended Data Fig. 2).
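As a hedged illustration of how a cluster-frequency matrix could be assembled, the sketch below counts how often each pair of cells receives the same cluster label across a population. It assumes the graph-clustering step has already been run and that each individual is available as a mapping from cell name to cluster label; the function name, cell names and labels are hypothetical, and the graph-clustering algorithm itself is not reproduced.

```python
def cluster_frequency(assignments, cells):
    """Fraction of individuals in which each pair of cells shares a cluster.

    assignments: list of {cell: cluster_label} mappings, one per perturbed
                 individual (assumed output of a separate clustering step).
    cells:       ordered list of cell names defining rows and columns.
    Returns an n x n list of lists with entries in [0, 1].
    """
    n = len(cells)
    counts = [[0] * n for _ in range(n)]
    for labels in assignments:
        for i, cell_i in enumerate(cells):
            for j, cell_j in enumerate(cells):
                # Labels are only compared within one individual, so label
                # names need not be consistent across individuals.
                if labels[cell_i] == labels[cell_j]:
                    counts[i][j] += 1
    total = len(assignments)
    return [[count / total for count in row] for row in counts]


# Toy example: three cells, two perturbed individuals (hypothetical labels).
cells = ["a", "b", "c"]
assignments = [
    {"a": 0, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 1},
]
freq = cluster_frequency(assignments, cells)
```

In this toy case, cells a and b share a cluster in one of two individuals, so their entry is 0.5, while a and c never co-cluster. The diagonal is always 1 because each cell trivially clusters with itself.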
For all cluster-frequency matrices, matrix element (i, j) corresponds to the frequency with which cells i and j cluster together across the 1,000 perturbed individuals. Row and column orders minimize variance along the diagonal (Methods). Cell cluster assignments (colour) follow the perturbed \(\tilde{\mathbb{M}^{4}}\) dataset (Fig. 1b reproduced in dashed box). Top, dendrogram of the hierarchical clustering.
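The row and column sorting can be sketched in the same spirit. The legend does not specify the linkage method or the distance measure, so the example below makes two assumptions: the distance between two cells is 1 minus their co-clustering frequency, and clusters are merged by average linkage. The leaf order of the resulting merge sequence is then used to reorder the matrix. The function names and the toy matrix are hypothetical.

```python
def leaf_order(freq):
    """Order indices by average-linkage agglomerative clustering.

    Assumptions (not stated in the source): distance = 1 - frequency,
    and average linkage between clusters.
    """
    clusters = [[i] for i in range(len(freq))]

    def distance(a, b):
        # Mean pairwise distance between members of clusters a and b.
        total = sum(1.0 - freq[i][j] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        merged = clusters[x] + clusters[y]
        # Replace the two closest clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters[0]


def reorder(freq, order):
    """Apply the same permutation to rows and columns."""
    return [[freq[i][j] for j in order] for i in order]


# Toy matrix: cells 0 and 2 usually co-cluster, as do cells 1 and 3.
toy = [
    [1.0, 0.1, 0.9, 0.0],
    [0.1, 1.0, 0.0, 0.8],
    [0.9, 0.0, 1.0, 0.1],
    [0.0, 0.8, 0.1, 1.0],
]
order = leaf_order(toy)
sorted_matrix = reorder(toy, order)
```

After reordering, cells that frequently co-cluster sit next to each other, so high-frequency entries form blocks along the diagonal. A library routine that also returns a dendrogram would be the more usual choice in practice; the explicit loop is used here only to keep the sketch dependency-free.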