Introduction

Spatial omics (SO) has revolutionized molecular mapping by enabling transcriptomic, proteomic, and metabolomic analyses within tissues1,2,3,4,5,6,7,8,9,10,11. The immense throughput has provided transcriptome- and proteome-wide data across assorted tissues, including the spinal cord12,13,14,15 and mouse brain, each with unique physiological properties16,17,18,19,20,21. Yet, these methods fall short when it comes to leveraging image-derived features that capture the nuanced spatial organization of tissues, due to challenges both in aligning molecular data with histological context and in extracting high-dimensional spatial patterns from tissue images. Addressing how molecular identities vary across tissue structures requires precision mapping and the integration of information across various scales: the cellular, regional, and whole tissue22,23.

Most applications of SO are molecule-driven, i.e., the spatial distributions of sets of transcripts are used to make distinctions24,25. These ‘bottom-up’ approaches often employ clustering algorithms that learn from the variability in local and global gene expression patterns26,27,28,29,30,31,32,33. GASTON models the transcriptomic space as a topographic map34 and partitions tissues using piecewise functions. BANKSY35 and NiCo36 consider transcriptomic features between neighboring cells to cluster in a spatially informed manner. Baysor37 defines cells by grouping key transcripts using Markov random fields and can incorporate spatial information to generate more precise clusters. Approaches like SpatialGlue38 and BrainAlign39 integrate multi-omics data and even enable cross-species regional alignment. These approaches continue to offer valuable insights, but precise expression-based whole-tissue segmentation can be enhanced by utilizing well-defined spatial markers from microscopy data at multiple scales37,38,39.

Top-down approaches derive regional information, such as anatomical structures and pathologies, from image features or by alignment/registration. This methodology allows researchers to ask questions about molecular changes in a visually defined region and to cross-validate bottom-up transcriptomic analyses. Existing tools use interpretable algorithms based on local pixel values40,41 or registration42,43. Recent methods leverage deep learning, including autoencoders44, convolutional neural networks45,46, transformers47,48,49,50, and combinations51,52,53, to segment and label relevant cells or tissue structures in immunofluorescence and histology datasets. Much focus is placed on developing training techniques that maximize the potential of these models’ complex processing abilities54,55. These tools rely on either human labeling or computational labeling. Human-labeled data enhances interpretability, but it is time-consuming to produce, can introduce errors and biases, is not widely available56, and may hinder novel discoveries57. Computational techniques enable nuanced embeddings but remove human oversight, degrading biological interpretability and increasing sensitivity to noise58,59. Despite this problem, there is a growing desire for tools that perform non-manual segmentation of biological data60. DeepSlice45 and FTUNet48 handle macro-scale atlas mapping and lesion segmentation, iStar47 and UniFMiR61 can enhance micro-scale data, and meso-scale analyses bridge tissue structures and the cells defining them62,63.

In this manuscript, we present SiDoLa-NS, a tool for tissue segmentation across micro-, meso-, and macro-scales of cell and tissue organization. By defining regional identity through image features alone, SiDoLa-NS can avoid biases from global gene expression and double dipping (i.e. defining and evaluating a region with the same gene set). We validate SiDoLa-NS with internal and publicly available brain, spinal cord, and peripheral nerve images and show the versatility of our tool across various platforms, including 10X Xenium, Visium SD/HD, H&E, and immunofluorescence sections. At the micro-scale, SiDoLa-NS performs cell and nuclei segmentation; at the meso-scale, it allows within- and between-fascicle analyses in the sciatic nerve; at the macro-scale, SiDoLa-NS defines brain regions based on the Allen Brain and Spinal Atlases64,65,66,67,68,69,70,71. SiDoLa-NS automatically annotates nervous system tissue and gives researchers ways to group the data across the scales and ask unbiased molecular questions: How do gene expression patterns of specific cell types vary across brain regions? What structural changes underlie molecular disruptions in disease? How do meso-scale features, such as clusters or bands of cells, relate to macro-scale organization in the nervous system?

But how can these models be trained without the burden of extensive manual annotations? By simulating training images that mimic tissue structure at each level of interest: nuclei and cell boundaries, intermediate features (like clusters or bands of cells), and broader atlas-defined regions. This is distinct from diffusion and GAN-based approaches, which take real microscopic images as input for synthetic data generation. Our in-silico image simulations offer 1) full control over feature distributions, 2) representation of imperfections and noise (yielding model robustness), and 3) access to a virtually infinite and annotator-error-free training set. The result is omics datasets that are equally rich in quantifiable spatial and molecular data.

Results

Overview of SiDoLa-NS

We present SiDoLa-NS (Simulate, Don’t Label – Nervous System), a suite of tools designed for detection and classification of image features across a spectrum of nervous system tissues. SiDoLa-NS (pronounced See Doh La, think Do-Re-Mi) employs an image-based top-down approach, assigning labels to biologically defined regions of an image (Fig. 1A). SiDoLa-NS models are trained on fully synthetic/simulated images reflecting different staining modalities like immunohistochemistry (IHC) and immunofluorescence (IF), not on generative AI output (Fig. 1B). Our approach generates high-resolution datasets and models imperfections in tissue samples72. For example, images from biological samples may contain tissue sections that are 1) partially out of field, 2) broken, 3) poorly focused or low in contrast/resolution, and 4) distorted or warped. By simulating and representing these challenging sections, SiDoLa-NS can succeed where other models may encounter difficulties.

Fig. 1: Simulate don’t label: SiDoLa-NS for top-down segmentation of regions, clusters, and cells in neuronal tissue.
figure 1

A Schematic illustrating the distinction between bottom-up and top-down methods. Top-down approaches capture distinct image-based features and representations of a tissue. Bottom-up approaches consider omics data (such as gene expression) to build molecular definitions of regions and cell types. B Simulations create a massive synthetic dataset by stringing together biophysical nodes that build Cells and Regions of the image and then transform them from 3D geometry into images with an optics engine. Synthetic micrographs are shown below the nodes, giving examples of variation and features for a mouse coronal brain slice. C A mouse coronal brain Nissl section from the Allen Brain Atlas, shown as the “original slice”, with cartoons below illustrating the concept of multiple spatial scales, where segmentation identifies individual instances of a variety of features. Specifically, this approach identifies atlas-based regions at the macro-scale, captures cell density and other meso-scale patterning in the tissue, and segments cells to incorporate subcellular micro-scale information, like cell type and structure. D SiDoLa-NS was tested on three nervous system tissue types including mouse brain and spinal cord, as well as porcine sciatic nerve (a human-scale nerve cross section).

SiDoLa-NS can also perform multi-scale data integration. This is useful for tissue such as the brain, where it is essential to understand the spatial extents of regions to find clusters, lamina, and striations of cells, while detecting nuclei (Fig. 1C). Here, we explore SiDoLa-NS on three publicly available datasets from mouse and porcine neuronal tissues, including the brain, spinal cord, and sciatic nerve across various imaging modalities (Fig. 1D). SiDoLa-NS is designed for a wide range of platforms, from classical IHC and IF tissue sections, to the full range of SO technologies. These can be explored using the web application https://sidolans01.mgifive.org/.

SiDoLa-NS is trained on biophysical simulations of nuclei in brain slices to segment a histological section of mouse brain

We began with multi-scale segmentation of a Visium HD mouse brain coronal slice. We generated a fully synthetic image set (with 4353 pairs and 1.96 x 10^6 objects) to mimic DAPI-stained nuclei in neurons, oligodendrocytes, and other cells (Fig. 2A 1). SiDoLa-NS-Micro-CNS, an instance segmentation CNN, was trained to find nuclei using this dataset (Methods). We tested SiDoLa-NS-Micro-CNS on a publicly available hematoxylin and eosin (H&E) coronal cross section of mouse brain from 10X Genomics’ Visium HD platform. Varied morphology was observed in a single channel of the H&E staining (Fig. 2A 2). SiDoLa-NS-Micro-CNS successfully segmented nuclei across a range of staining intensities (Fig. 2A 3). At the “meso-scale”, we observed two distinct cellular populations: high-density ‘striated’ regions and diffuse ‘laminar’ regions (Fig. 2B 1). Striated regions included granular cell zones like the dentate gyrus and hippocampal CA regions (Fig. 2B 2, 3) in the Visium HD brain slice73. Model performance was measured with the synthetic validation dataset. SiDoLa-NS-Micro-CNS achieved a mean Average Precision at an Intersection over Union (IoU) threshold of 0.5 (mAP50) of 0.715 (Fig. S1).
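The mAP50 metric counts a predicted nucleus as correct when its bounding box overlaps a ground-truth box with IoU ≥ 0.5. A minimal sketch of that matching criterion (box coordinates below are illustrative, not taken from the study):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A detection counts as a true positive at mAP50 when IoU >= 0.5
pred, truth = (10, 10, 30, 30), (15, 12, 32, 28)
is_match = iou(pred, truth) >= 0.5
```

Averaging precision over confidence thresholds (and, for mAP, over classes) with this matching rule yields the reported scores.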

Fig. 2: Multi scale segmentation and classification of a mouse coronal brain section.
figure 2

A (1) At the micro-scale, SiDoLa-NS was trained exclusively on simulated data featuring multiple cell types. (2) Micrographs of real brain images were used to run SiDoLa-NS-Micro-CNS at this scale, and (3) nuclei (light and dark blue bounding boxes) were identified against background. B (1) At the meso-scale, simulated images featured bands of higher-density cells. (2) Micrographs from the brain slice; note the density patterns. (3) SiDoLa-NS classes overlaid, which differentiated high-density striated regions (cyan) and adjacent cells (blue). C (1) At the macro-scale, SiDoLa-NS was trained on simulated images featuring full or hemi coronal sections of brain tissue. (2) SiDoLa-NS-Macro-mCB segmented real images of an H&E coronal brain slice, (3) highlighting biologically defined regions. D Comparison of top-down and bottom-up approaches side-by-side. The left brain image and lower UMAP are SiDoLa-NS top-down, while the right brain diagram and upper UMAP are bottom-up. The UMAPs were constructed from transcriptomic data; colors map to Seurat clusters in the upper UMAP and to SiDoLa-NS-labeled brain regions in the lower.

To test generalizability, we employed SiDoLa-NS-Micro-CNS on publicly available DAPI-stained images of lung adenocarcinoma (10X Genomics Visium HD). Without parameter adjustment, we observed excellent segmentation and detection of nuclei, even in high-density regions of this tissue (Fig. S2). For the tissues examined here, existing nuclei-segmentation methods are likely sufficient. We present SiDoLa-NS primarily to highlight its versatility, as it can be adapted from single-nucleus tasks to whole-brain simulations.

SiDoLa-NS is trained on biophysical simulations of brain slices and used to map brain regions in a histological section of mouse brain

We sought to assign regional identities to the detected nuclei. To accomplish this, we simulated fully synthetic coronal slices of mouse brain (single hemispheres and full slices) with named regions based on the Allen Brain Atlas reference (Fig. 2C 1). CNNs were trained on 144,438 images and validated on 1092 images. SiDoLa-NS delineated distinct brain areas (Fig. 2C 2, 3). The primary CNN (SiDoLa-NS-Macro-mCB) achieved an excellent precision-recall curve (F1 = 0.95 at a confidence of 0.381), and the confusion matrix on the validation set indicated negligible misclassification. We measured a top mAP50 of 0.977 on the validation set (Fig. S3).

Another feature of the SiDoLa-NS suite is robust performance with noisy or blurry samples. We tested SiDoLa-NS-Macro-mCB on images where noise was introduced in a calibrated fashion (“Methods”). SiDoLa-NS maintained mAP50 above 0.783 up to 70% noise mix (Fig. S4A, B). Next, we simulated images with complete (even) and regional (distorted) focusing errors. SiDoLa-NS-Macro-mCB had mAP50 above 0.803 and 0.722 across even and distorted images, respectively (aperture f-number was brought as low as 0.0013). We tested robustness against staining modality, inferring on unseen Nissl-stained brain samples with labels provided by the Allen Brain Atlas. SiDoLa-NS was able to accurately delineate each brain region (Fig. S4C-E). We next asked how SiDoLa-NS labels compared with outside expert annotations of mouse brain regions74 and found high correspondence across Cortex, Thalamus, Striatum, and Hippocampus (Fig. S5).
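One way to introduce noise in a calibrated fashion is to blend each image with random noise at a fixed fraction. The exact noise model used in the study may differ, so this is only an illustrative sketch (the 70% mix mirrors the highest level at which mAP50 stayed above 0.783):

```python
import numpy as np

def mix_noise(image, fraction, seed=None):
    """Blend an image (values in [0, 1]) with uniform random noise.
    `fraction` is the noise share; uniform noise is a simplifying assumption."""
    rng = np.random.default_rng(seed)
    noise = rng.uniform(0.0, 1.0, size=image.shape)
    return (1 - fraction) * image + fraction * noise

img = np.full((64, 64), 0.8)              # toy "tissue" image
degraded = mix_noise(img, 0.70, seed=0)   # 70% noise mix, as in the robustness test
```

Sweeping `fraction` from 0 to 1 and re-scoring mAP50 at each level reproduces the shape of the robustness curve described above.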

To assess the complementary roles of top-down and bottom-up approaches to cellular labeling, we compared SiDoLa-NS to Visium HD’s Seurat pipeline for clustering and spatial overlay (Fig. 2D, “Methods”). In practice, the two methods work in concert: top-down excelled at finding boundaries and full regional coverage, while bottom-up distinguished unique cell types within regions. For example, cells that SiDoLa-NS assigned to the Striatum-Like Amygdala, Lateral Hypothalamus, and Thalamus were difficult to differentiate by molecular cluster. Using both approaches together can help disambiguate bottom-up numbered clusters and map them to top-down named atlas regions (Fig. S4F).

We also compared our image-based regional definitions against expression-based regional definitions. Since “bottom-up” methodologies perform clustering without assigning biological labels, we used overlap with molecularly defined regions as the basis for comparison. The comparisons (Fig. S6) are against MILWRM (Multiplex Image Labeling with Regional Morphology), SpaGCN (Spatially variable genes by Graph-CNN), BANKSY (Building Aggregates with a Neighborhood Kernel and Spatial Yardstick), and GraphST26,28,75. MILWRM performed the best; BANKSY, SpaGCN, and SiDoLa-NS were close behind (accuracies of 69.4%, 64.4%, 63.9%, and 60.1%, respectively), and GraphST results contrasted the most with patterns defined by MILWRM (accuracy 51.7%). We noticed that molecularly defined labels incorrectly included hindbrain and olfactory bulb in a mid-coronal slice, suggesting expression-based methods lack anatomical resolution and are more prone to region misidentification.
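Because bottom-up clusters carry no anatomical names, one simple way to score overlap is to map each numbered cluster to its majority region label and then measure agreement. The study’s exact metric may differ; the sketch below, with toy labels, is only illustrative:

```python
import numpy as np
from collections import Counter

def overlap_accuracy(cluster_ids, region_labels):
    """Map each unlabeled cluster to its majority region, then score agreement.
    A simple stand-in for an overlap-based accuracy comparison."""
    mapping = {}
    for c in set(cluster_ids):
        members = [r for ci, r in zip(cluster_ids, region_labels) if ci == c]
        mapping[c] = Counter(members).most_common(1)[0][0]  # majority vote
    preds = [mapping[c] for c in cluster_ids]
    return float(np.mean([p == r for p, r in zip(preds, region_labels)]))

# Toy example: 5 cells, 2 clusters, 2 named regions
acc = overlap_accuracy([0, 0, 0, 1, 1],
                       ["cortex", "cortex", "thalamus", "thalamus", "thalamus"])
```

Applied per cell (or per spot), this yields a single accuracy per method, comparable across the tools above.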

SiDoLa-NS for spinal cord nuclei detection and atlas mapping

SiDoLa-NS was evaluated on publicly available immunofluorescence (IF) sections of mouse spinal cord stained with DAPI (for DNA in nuclei), NeuroTrace (a fluorescent Nissl stain that labels neurons), and a ChAT antibody (which marks cholinergic neurons)70. At the micro-scale, SiDoLa-NS-Micro-CNS segmented the nuclei despite this IF section being lower resolution than the brain section (Fig. 3A 1, 2, 3).

Fig. 3: Segmenting nuclei and mapping atlas regions in a mouse spinal cord section.
figure 3

A (1) At the micro-scale, SiDoLa-NS trains on simulated data. (2) Micrographs of tiles from cross-sectioned spinal cords from dataset70. (3) SiDoLa-NS-Micro-CNS predictions (blue squares) of nuclei in the spinal cord. B (1) At the macro-scale, SiDoLa-NS trains on simulated images of spinal cord sections. (2) Micrographs of mouse spinal cord cross sections. (3) SiDoLa-NS-Macro-mSC predictions on different spinal cord regions (different colors). C Cellular and regional results on the whole spinal cord section. Marker size is governed by the measured cell size, and coloration is by regions, some of which are labeled. D Cellular representation of the same spinal cord section from (C), colored by three main regions within the spinal cord (blue, dorsal horn; yellow, ventral horn; green, white matter). Below, a bar chart depicts the nuclei count per region. E, F The cellular spinal cord representation, where cells are colored to depict (E) Nissl/NeuroTrace+ neurons in orange and (F) ChAT+ cells in red, with bar charts measuring the % of cells positive in each region.

We then simulated whole and hemi spinal cord histological slices and varied parameters including noise, warp, and tissue degradation (Fig. 3B 1). With this simulated image set (12,609 training, 30 validation), we trained SiDoLa-NS-Macro-mSC. Predictions on biological IF whole-tissue sections (Fig. 3B 2) highlighted various spinal laminae and prominent regions in the white and grey matter (Fig. 3B 3). SiDoLa-NS-Macro-mSC achieved a top precision of 0.96 at a confidence of 0.92 on the validation set (Fig. S8). We next examined its robustness to variations in focal distance and tissue rotation; the CNN maintained an mAP50 above 0.70 within a range of -200 µm to +1000 µm from the standard distance in training images (Fig. S8A). Similarly, we found SiDoLa-NS-Macro-mSC had mAP50 above 0.60 up to 40 degrees of rotation along the y-axis (tilting one edge of the image towards the camera).

We evaluated SiDoLa-NS-Macro-mSC on an unseen Nissl-stained image to examine the model’s abilities under varying staining conditions. SiDoLa-NS-Macro-mSC achieved a top precision of 0.96 at a confidence of 0.929 (Fig. S8B, C). Performance was class-variable: larger, well-bounded regions like the dorsolateral fasciculus achieved a precision of 0.995, whereas thinner layers, like Laminae 1 and 2, had lower precisions (0.145 and 0.622, respectively).

To validate regional assignments on the IF mouse spinal cord sections, we compared the staining intensities in the predicted atlas regions. SiDoLa-NS multiscale results provide quantitative measurements for predicted objects at the micro-, meso-, and macro-scales, including area, channel intensity, and texture (measured as the standard deviation of pixel intensities). We combined the SiDoLa-NS-Macro-mSC predicted regions into three generic groupings: white matter, dorsal gray matter, or ventral gray matter (Fig. 3D). As expected, regions labeled white matter were located on the outer perimeter of the spinal cord, enclosing dorsal and ventral regions (Figs. 3E, S8D). We next quantified the variable feature distributions across these groupings. After applying an intensity threshold, we examined the ChAT-positive cell distribution (Fig. 3F). As expected, the ventral horn was 21% ChAT+ compared to <1% in the dorsal horn76.
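The per-region ChAT+ fraction can be computed by thresholding per-cell intensity and grouping by predicted region. A minimal sketch with a toy per-cell table (the column names and the 0.5 cutoff are illustrative, not the study’s values):

```python
import pandas as pd

# Toy per-cell table of the kind a multiscale inference might emit
cells = pd.DataFrame({
    "region": ["ventral", "ventral", "dorsal", "dorsal", "white"],
    "chat_intensity": [0.9, 0.7, 0.1, 0.2, 0.05],
})

threshold = 0.5  # hypothetical intensity cutoff for ChAT positivity
cells["chat_pos"] = cells["chat_intensity"] > threshold

# Percentage of ChAT+ cells per grouped region
pct_pos = cells.groupby("region")["chat_pos"].mean() * 100
```

The same groupby pattern extends to any per-cell feature (area, texture, other channels) across the three groupings.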

Axonal and nuclear segmentation within sciatic nerve fascicles

Peripheral nerves comprise multiple cell types, including multiple Schwann cell subtypes, many fibroblast populations, and numerous types of vasculature, all of which play roles in nerve health and disease77. We evaluated SiDoLa-NS’s ability to detect axons, nuclei, and meso-scale features in the nerve. In both H&E and DAPI staining, there is a distinction between myelinated axons (larger, lower intensity) and Schwann cell nuclei (smaller and brighter)78. We sought to detect axons and Schwann cell nuclei, as well as the fascicles in which they are bundled, within a cross section of a human-sized sciatic nerve. Thus, we created simulations (7198 training and 612 validation images, 2.78 x 10^5 objects in the training set) at the cellular (micro-) scale with simple geometry (Fig. 4A 1). We trained SiDoLa-NS-Micro-PNS to segment myelinated axons and nuclei as unique classes. The model achieved perfect precision at a confidence of 0.894 and an F1 score of 0.9 at a confidence of 0.4 on simulated validation images (Fig. S9). We next used a publicly available cross section of a porcine sciatic nerve (Fig. 4A 2), from which these instances were segmented effectively (Fig. 4A 3).

Fig. 4: Porcine sciatic nerve with automated fascicle identification and axon/nuclei discrimination.
figure 4

A Micrographs of a simulated cross section (1) of sciatic nerve with high intensity nuclei and dim axons. (2) Example tiles of a cross section of a porcine sciatic nerve and (3) SiDoLa-NS-Micro-PNS prediction on axons (dark blue) and nuclei (light blue) on the same sections. B Micrographs of simulated images (1) of textured bundles (fascicles) and perineurium using varying background contrast. (2) Micrographs of a cross section of a porcine sciatic nerve. (3) SiDoLa-NS-Meso-pSN prediction of fascicles on porcine sciatic nerve section (blue circles). C Ranked histogram showing distribution of myelinated-axon (blue line) and nuclei (red line) candidates’ relative pixel areas within all regions of the porcine sciatic nerve section shown (n ~ 10,000 per class). Inset is a small region of the nerve showing non-fascicle-contained cells (gray), axons (blue), and nuclei (red). D Micrograph showing overview of entire sciatic nerve, with SiDoLa-NS-Meso-pSN fascicle detection below. The dashed box indicates the area shown in higher magnification on the right. Circular markers are axons or nuclei, and their color indicates different fascicle instances. E Histograms depicting the cell contents within fascicles. The blue histogram represents axons, and the red histogram represents nuclei. Fascicles with ≥ 10 are shown, except for one outlier with 740 axons (not shown).

Many neurodegenerative diseases are largely selective for a single class of neurons: for example, sensory neurons in chemotherapy-induced peripheral neuropathy (CIPN)79,80 or motor neurons in ALS25, and fascicles enriched in axons from these neurons are differentially impacted81. Thus, we created meso-scale simulations of fascicles as ellipsoid objects with varying size, degree of overlap, and fascicle/background intensity contrast (Fig. 4B 1), with 6,588 training and 304 validation images (89,074 training objects). SiDoLa-NS-Meso-pSN, an automated fascicle-segmentation CNN, was trained on this dataset. On simulated validation images, this model’s F1 reached 1.0 for confidence values between 0.1 and 0.9 (Figs. S10, 4B 2, 3).

Next, we applied the model to the porcine sciatic nerve section at the micro- and meso-scales, yielding nuclei, axons, and fascicles in a single dataset. We confirmed the expected variation in the size (area) of Schwann cells and myelinated axons (Fig. 4C). SiDoLa-NS-Meso-pSN detected discrete fascicles throughout the tissue slice (Fig. 4D). We extracted the variation in fascicle content for the number of myelinated axons or nuclei, revealing that porcine sciatic nerve fascicles had approximately 100 axons and 70 nuclei, consistent with manual annotation of the biological images (Fig. 4E). Thus, SiDoLa-NS-Meso-pSN enabled basic analyses of content variation between and within fascicles solely from an H&E-stained nerve section.
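Counting axons and nuclei per fascicle reduces to assigning each micro-scale detection to the meso-scale fascicle that contains it. A sketch with hypothetical centroids and circular fascicle outlines (real fascicle shapes from segmentation would replace the circles):

```python
# Hypothetical detections: object centroids with class labels, plus fascicles
# approximated as (cx, cy, r) circles, echoing the ellipsoid simplification.
objects = [(12, 10, "axon"), (14, 11, "nucleus"), (13, 9, "axon"), (80, 80, "axon")]
fascicles = {"f1": (12, 10, 5), "f2": (80, 80, 5)}

def fascicle_contents(objects, fascicles):
    """Count objects of each class whose centroid falls inside each fascicle.
    Objects outside every fascicle are ignored (non-fascicle-contained cells)."""
    counts = {name: {"axon": 0, "nucleus": 0} for name in fascicles}
    for x, y, cls in objects:
        for name, (cx, cy, r) in fascicles.items():
            if (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2:  # centroid inside circle
                counts[name][cls] += 1
                break
    return counts

contents = fascicle_contents(objects, fascicles)
```

Per-fascicle counts collected this way are what the axon/nuclei histograms summarize.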

Integrating gene expression and anatomical context via macro-scale region assignment

Thus far, gene expression patterns have been used only to validate SiDoLa-NS’s predictions at the macro (reference) scale. The top-down analysis after SiDoLa-NS inference allows for downstream investigations via the predicted macro regions. These can be used, for example, to define marker genes for distinct tissue regions. They can also be used to better compare regional expression changes between case and control sections.

In the mouse brain, 10X Visium HD data were linked to SiDoLa-NS-Micro-CNS predicted cells. RNA expression patterns were then analyzed across the SiDoLa-NS-Macro-mCB predicted regions. Region-specific genes (ranked by the ratio of average transcript count in a given region to that in all other regions) were visualized (Fig. 5A). Some expression profiles were similar between related regions, like Itih3, Calb2, and Agt in the hypothalamus and hypothalamic-adjacent regions. The four top-ranked (highest-ratio) genes are known regional markers, including Tshz2 in the retrosplenial area82 and Neurod6 in the hippocampus83,84, along with less commonly used marker genes like Prkcd in the ventral group of the dorsal thalamus. An extended hierarchically clustered heatmap shows additional differential gene expression (Fig. S11); some genes showed regional localization in the coronal mouse brain (Fig. 5B). We visualized other genes with high regional specificity, including Tac1, Penk, Mobp, Acta2, and Prox1 (Fig. S12). These genes exhibit sharply localized expression, possibly indicating functional specification.
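The region-specificity ranking described above (average transcript count in a region divided by the average in all other regions) can be sketched as follows; the table, gene assignments, and counts are toy values, not the study’s data:

```python
import pandas as pd

# Toy cell-by-gene counts with a predicted region label per cell (illustrative)
df = pd.DataFrame({
    "region": ["hippocampus", "hippocampus", "thalamus", "thalamus"],
    "Neurod6": [8, 6, 1, 0],
    "Prkcd":   [0, 1, 7, 9],
})

def region_ratio(df, gene, region):
    """Average count of `gene` in `region` divided by its average everywhere else."""
    in_r = df.loc[df.region == region, gene].mean()
    out_r = df.loc[df.region != region, gene].mean()
    return in_r / out_r
```

Ranking genes by this ratio within each region surfaces candidate regional markers; in practice a pseudocount in the denominator would guard against division by zero.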

Fig. 5: Top-down anatomical regions allow unbiased transcriptomic questions in mouse brain and spinal cords.
figure 5

A Heatmap showing the expression of the ‘top ratio’ genes for 21 regions from the mouse cortical Visium HD brain slice (green is high average counts, red is low). B Diagram of individual cells segmented with SiDoLa-NS-Micro-CNS, colored by total counts per cell for the specified gene (gray = 0, black = mean, red = max). C DAPI-stained transverse lumbar spinal cord section. There is some tissue delamination around the edges and a clockwise rotation of the slice (Scale bar 400 microns). D Cellular and regional results on the transverse section in C. Markers correspond to cells, marker size is governed by the measured cell size, and coloration is by regions. Low confidence (< 0.3) results discarded. E Heatmap showing the expression of the ‘top ratio’ genes for 14 regions from the mouse spinal cord slice (green is high average counts, red is low). Some regions were grouped for clarity: Central Region contains Lateral Cervical Nucleus and Central Canal; Dorsal Medial Region contains Dorsal corticospinal tract, Dorsal Nucleus, and Dorsolateral fasciculus; Dorsal Lateral Region contains Rubrospinal tract and Lateral spinal nucleus; Lamina subgroups were merged, including Lamina 10 with the Lamina 7 group. Highlighted genes appear again in subpanel F. F Some selected top hits from Kruskal-Wallis analysis were used to color the slice. Colors represent which of the 6 selected genes each cell expressed in the highest proportion. Note that regional distinctions are solely based on RNA expression patterns, not image features. Spp1/Meis2 marked cells co-express these two genes.

In a separate mouse lumbar spinal cord section processed on the 10X Xenium platform (Fig. 5C), a similar analysis was performed with SiDoLa-NS-Macro-mSC. The transcript panel, designed for brain-region specificity, targeted 244 genes and provided limited resolution in the spinal cord. Some spinal subregions were merged to aid in visualization (Fig. 5D). A heatmap was generated following the procedure in Fig. 5A (Fig. 5E): ChAT, implicated in the initiation of motor output, was most highly expressed in Laminae 7-1085,86. Spag16, which builds cilia to support CSF flow, co-localized with the central canal87. Acta2 was associated with vasculature throughout the CNS88. After revealing these gene patterns throughout the spinal cord, we performed a Kruskal-Wallis analysis to sort genes by their regional selectivity. Organizing these transcriptomic “low-level” data into functionally significant “high-level” categories allowed us to generate an informative visualization of the slice’s gene expression from a bottom-up source (Fig. 5F). Nts89, Spp176,90, and Spag1687 localize as expected, whereas others like Gfap and Meis2 showed unexpected yet regionally enriched expression patterns.
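Sorting genes by regional selectivity with a Kruskal-Wallis test can be sketched with scipy; the per-region expression values below are simulated toy data, not measurements from the section:

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(0)

# Toy per-cell counts for one gene across three regions (illustrative):
# enriched in one region, near-background in the other two.
regions = {
    "lamina7": rng.poisson(8, 50),
    "white_matter": rng.poisson(1, 50),
    "dorsal_horn": rng.poisson(1, 50),
}

# Kruskal-Wallis compares the per-region distributions without assuming normality
h_stat, p_value = kruskal(*regions.values())
```

Running this per gene and ranking by the H statistic (or p-value) orders genes from most to least regionally selective, which is how the top hits in the visualization were chosen.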

Regional and cellular morphology using top-down analysis

We used a publicly available Xenium dataset including coronal slices from two transgenic CRND8 mice (8 and 18 months; an Alzheimer’s disease model) and two WT mice (8 and 13 months) (Fig. 6A). We ran SiDoLa-NS-Macro-mCB on the sections to extract morphological data on single cells and atlas-defined brain regions (a one-step process) and detected differences between genotypes and age groups91,92 (Fig. 6B). We first asked which regions had changes in the number of cells across age and genotype, and found that the basolateral amygdala, the dentate gyrus granule cell layer, and the lateral amygdalar nucleus (Fig. 6C upper; F-stat 20.1, 61.4, 0.63) showed increasing cell numbers with age in WT mice, whereas the same regions showed the opposite trend in AD mutant mice. We noted lower region-calling confidence in the older Alzheimer’s-model brain, particularly in the somatosensory isocortex, auditory areas, and pallidum (Fig. 6C lower; F-stat 6.6, 47.3, 11.85), perhaps reflecting changes in these areas associated with neurologic decline, such as thinning of gray matter and the cerebral cortex93, in older CRND8 mice.

Fig. 6: Utilization of image features to study anatomical regions in mice.
figure 6

A DAPI-stained mouse coronal hemi-sections from two wildtype (WT) mice at 8 and 13 months (left side) and two transgenic CRND8 Alzheimer’s (TG) mice at 8 and 18 months (right side). B Diagram representing the same tissue sections but with markers representing single cells, and the coloration showing atlas regions, as defined by SiDoLa-NS-Macro-mCB (coloration matches Fig. 2D). C Bar charts measuring the number of cells per region (top) and the macro confidence within the region (bottom) in the four samples: WT at 8 and 13 months, and TG at 8 and 18 months. The bars are grouped by region, with basolateral amygdala, dentate gyrus granule cell layer, and lateral amygdalar nucleus on the top, and somatosensory isocortex, auditory areas of the isocortex, and pallidum below. D Histograms showing the distributions of NeuroTrace (Nissl) texture (measured as the standard deviation of pixel intensities) per cell, separated by spinal cord region. Horizontal lines show the mean for that region. Regions with low cell numbers were excluded. E Micrograph of spinal cord stained with NeuroTrace, where insets show a region of the lateral funiculus compared with Lamina 8, highlighting their different NeuroTrace textures.

Next, we investigated whether cell morphology could be used to distinguish the atlas regions within the mouse spinal cord section. The SiDoLa-NS inference automatically measures per-cell features for each available fluorescent channel. Indeed, we found that ‘NT cell texture’ (the per-cell standard deviation of NeuroTrace [Nissl] pixel intensities) was a strong segregator of spinal cord region (Fig. 6D, KW p ≈ 0, H-stat = 6139). Lamina 8 and the lateral funiculus have distinctly different NT textures (Fig. 6E). Therefore, with only image-based information at two different scales, SiDoLa-NS-Micro-CNS and SiDoLa-NS-Macro-mSC enabled spinal cord region distinction utilizing subtle morphological features.
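The ‘NT cell texture’ feature is simply the standard deviation of channel intensities over the pixels in each cell’s segmentation mask. A minimal sketch with a toy channel and two illustrative cell masks (a smooth cell and a speckled one):

```python
import numpy as np

def cell_texture(channel, mask):
    """Texture of one cell: standard deviation of channel intensities
    over the pixels inside the cell's boolean segmentation mask."""
    return float(channel[mask].std())

# Toy 4x4 channel: uniform on the left half, speckled on the right (illustrative)
channel = np.array([[5, 5, 0, 9],
                    [5, 5, 9, 0],
                    [5, 5, 0, 9],
                    [5, 5, 9, 0]], dtype=float)

smooth = np.zeros((4, 4), dtype=bool);   smooth[:, :2] = True    # flat region
speckled = np.zeros((4, 4), dtype=bool); speckled[:, 2:] = True  # high-variance region
```

Computed per cell and grouped by predicted region, this single scalar is what separates, for example, Lamina 8 from the lateral funiculus.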

Discussion

Since Ibn al-Haytham’s early work, optics has fascinated scientists with the power to reveal the invisible. Long before omics, histological images provided rich insights into biological architecture. Pathologists like Virchow, Malpighi, and Jacobi used microscopy to uncover unique tissue structures and pathological changes. Now with spatial omics (SO) technologies, we can link these molecular analyses with optical features of the tissues and cells and unveil new insights into the molecular underpinnings of organismal biology94,95. In the nervous system, function can be gleaned from subtle distinctions in disparate cell types and organized, intricate tissue structures characteristic of these complex tissues96,97,98,99,100. Understanding the nervous system in health and disease requires an exploration of cellular and molecular patterns within their spatial context101,102,103. By sharing a top-down approach that can be used in conjunction with SO techniques, we hope to usher in broadly applicable models capable of identifying functional regions.

Here, top-down means overlaying information evident from the image, as opposed to building up that knowledge from the molecular level (bottom-up). SiDoLa-NS is built upon the premise that training a neural network (NN) using only real images is insufficient: the variation in hand-labeled images provides an inadequate training set for fully generalized machine learning tools. Therefore, SiDoLa-NS adopts the novel strategy “Simulate, Don’t Label” to generate large, unbiased, and error-free datasets. The vanguard SiDoLa-NS models were applied to three distinct neurological cases. First, we inferred cells, laminae, and atlas regions in a coronal slice of adult mouse brain. Then, we segmented a mouse spinal cord into individual cells and overlaid its reference atlas, accurately identifying over 20 unique regions. Finally, SiDoLa-NS evaluated a cross section of porcine sciatic nerve containing distinctive bundles of axons and Schwann cell nuclei organized into fascicles, as is seen in human nerves. We demonstrate that these datasets, coupled with spatial transcriptomics, can be used to identify marker genes of specific cell types. Additionally, the multi-scale data enables new discoveries: for instance, how are neuronal cells organized at the micro-scale to give rise to higher-scale organization in tissue? We lay out the simulation process so that others can take advantage of the pipeline for their own applications.

SiDoLa-NS and similar approaches allow classifiers to be more general. Using NNs outside the context in which they were trained is fraught with hallucinations and overfitting: NNs can 'memorize' their training data and may perform poorly on real samples because of batch effects introduced in the training set. By simulating variation beyond that seen in existing samples, the NN becomes robust to unseen variations. This subtle point is critical, given that the NNs in this manuscript are used exclusively out-of-context: the training and validation data are simulated. Another important note is that the simulated training data are generated by an engine independent of training and validation. It does not use a GAN (generative adversarial network) or diffusion approach to create the training set, avoiding recursive-generation collapse104. A large, diverse dataset is important to avoid producing a 'brittle' model. For breakthroughs like LLMs105, AlphaFold 2106, and Evo 2107, the sample size is staggeringly large. Conversely, models trained on excellent-quality but small datasets tend to fail out-of-context; examples include detecting pneumonia in chest x-rays, sepsis prediction, and other clinically meaningful failures108,109,110. Often these models trained on small data learn shortcuts111 or are underspecified112. Small language models may be good for niche, specific tasks, although even a paper arguing for quality over quantity adjusted quality only by increasing the duplicate data in the sample set, effectively shrinking the sample size113.

This approach permits the generation of massive, high-quality training data and the representation of difficult and rare cases. Existing tools such as CellProfiler perform cellular annotations comparable to SiDoLa-NS Micro. However, SiDoLa-NS operates at cellular (e.g., individual nuclei), meso (e.g., local tissue structures like bundles of axons), and atlas (e.g., entire brain or spinal cord) scales, not just identifying clusters or patterns but specifically naming brain or spinal regions. SiDoLa-NS models are also robust to high cell density and to image parameters that vary over a wide range. To overcome the inherent variability of manual annotation, this approach eliminates annotator bias by using simulation-derived ground truth, yielding precise and consistent object boundaries free from optical and sample artifacts. Grounded in biophysical principles, the simulation produces results that are both accurate and highly interpretable.

Although this process allows data-driven discoveries relating to well-understood functional categories, it is still essential to separate meaningful findings from noise. Some noise comes from sample preparation for spatial analysis, which may disfigure the tissue or destroy transcripts before they can be sequenced. Interpreting expression findings can also be challenging. For instance, the glial marker Gfap is expressed throughout the spinal cord, and Meis2 appears to be linked to dorsal root ganglia114, which innervate intermediate laminae115. Additionally, changes in the pallidum's shape often occur over longer time periods116,117, though the reduced confidence we noted in Fig. 5 may signify other well-documented changes in or around the tissue118.

With spatial-omics datasets, there is the possibility of double-dipping: conflating the significance of region-defining molecular markers with downstream results. Although it is useful to enhance top-down image analysis with bottom-up clusters, the statistical significance of downstream analyses may be falsely inflated if compounded with assumptions made in earlier steps. Using top-down regional definitions avoids these problems119,120,121.

This work does not diminish the significance of bottom-up approaches. Rather than pitting the two against one another, we argue for the relevance of both approaches for pathology and function; direct comparisons oversimplify their relationship. Molecular definitions inherently carry no "label" beyond significant genes, which can lead to confusion. As precision medicine defines the treatment and outcomes for more patients, the need for these top-down approaches may shrink to smaller domains. Until then, the rich features laden in microscopy will remain a valuable source for clinicians and researchers to draw from.

Segmenting complex but specific shapes can benefit from state-of-the-art SAM or SAM2 models122. These approaches have shown strong results in polyp, cardiac, and skin lesion settings123. However, they struggle when boundaries are ambiguous and are not inherently designed to handle multiple classes. For example, identifying the boundaries of constellations in a night sky image is difficult because no visible pixel edge marks where one ends and another begins—the boundaries exist only by reference to external definitions. The same challenge applies to segmenting nuclei in the brain or spinal cord, where the “boundary” reflects a change in pattern rather than a distinct pixel transition.

Additional exploration of NN architectures could yield even better inference. The NNs in this manuscript are instance segmentation and/or instance classification models built on the Ultralytics YOLO124 framework. Instance segmentation is powerful because it identifies unique regions and separates individual objects, such as cells, so we can measure the features that distinguish them. We also tested Mask R-CNN125 and Feature Pyramid Networks126 but found YOLO to have the most active development and the best efficiency. Still, YOLO and instance segmentation may not be the right approach for all applications, particularly when region boundaries are diffuse or when per-pixel accuracy is critical. Convolutional UNets typically perform anatomical region-definition tasks better, since each pixel's location and identity are consistent. In fact, we found that for some applications a two-stage approach can enhance performance: UNets are first trained to 'recode' the raw image with pixel-based classifications, and an instance segmentation model is then trained from those outputs, leading to better performance (data not shown).

In summary, SiDoLa-NS enables robust, generalizable spatial analysis through simulation-trained segmentation at multiple spatial scales. By defining regions based on images of tissue structure, it allows multimodal distributions of cells and transcripts to be mapped and analyzed in context, which is crucial for resolving complex tissues. This top-down approach, when integrated with SO, opens the door to scalable, interpretable models that bridge structure and molecular function. Integrating SiDoLa-NS with empirical pipelines and adaptive architectures that combine real and simulated data will further advance our ability to analyze SO datasets.

Methods

Atlas mapping

Resources from the Allen Software Development Kit (SDK) were utilized for the atlas-mapping pipeline of the mouse brain and spinal cord. Each atlas image had a corresponding structure image with color-coded regions. The structure image for each atlas image was retrieved, and a dictionary mapping regions to their respective color codes was constructed. Next, binary masks for each color code were constructed over all structure maps to isolate the unique regions. Polygon coordinates defining the contours of each region were produced and assembled into a JSON file, organized by slice number and region name. These region outlines were used to generate macro-level brain and spinal cord simulations.
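The atlas-mapping step can be sketched in plain Python. This is a minimal illustration, not the actual pipeline: the color codes below are hypothetical placeholders, and the toy outline tracer stands in for a real contour extractor (e.g., OpenCV's findContours).

```python
import json

# Hypothetical color-to-region lookup built from one Allen structure image.
color_to_region = {
    (8, 133, 140): "Isocortex",
    (255, 144, 159): "Hippocampal formation",
}

def binary_mask(pixels, color):
    """Boolean mask selecting the pixels matching one region's color code."""
    return [[px == color for px in row] for row in pixels]

def mask_outline(mask):
    """Toy contour: boundary pixels of a mask (a real pipeline would use a
    proper contour tracer and emit ordered polygon vertices)."""
    h, w = len(mask), len(mask[0])
    pts = []
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                edge = (x in (0, w - 1) or y in (0, h - 1)
                        or not (mask[y][x - 1] and mask[y][x + 1]
                                and mask[y - 1][x] and mask[y + 1][x]))
                if edge:
                    pts.append([x, y])
    return pts

def atlas_to_json(pixels, slice_no):
    """Assemble {slice: {region: outline}} as in the atlas-mapping step."""
    out = {str(slice_no): {}}
    for color, name in color_to_region.items():
        pts = mask_outline(binary_mask(pixels, color))
        if pts:
            out[str(slice_no)][name] = pts
    return json.dumps(out)
```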

Simulations overview

Simulations have the advantage of providing truly massive datasets: we can easily generate hundreds of thousands of images in a few hours. There is no annotator bias, the ground truth segmentation masks and labels are perfectly resolved, and there are no memorization issues since the classifier never sees real images until inference. With the simulated images, we represent examples of various tissue and optical aberrations that may be present in a physical tissue section:

  1. Out-of-field sections are characterized by missing regions which fall outside the field of the image.

  2. Broken tissue is a common occurrence during the tissue handling process, resulting in "cracks" in the tissue.

  3. Poor focus/contrast/resolution is a result of the automated imaging failing to capture photos at the highest fidelity.

  4. Distorted or warped tissue sections are also likely a result of the tissue handling process.

These aberrations are not easily reversible, and so analysis must be resilient to variations and combinations of these. We also over-represent examples that are difficult for existing segmentation algorithms, such as extremely high-cell density. Finally, ground truth labels and segmentation masks for the training images can be dynamically updated during simulation generation, enabling the entire schema to evolve—an impractical task if manually re-labeling 100K images. At no point during training are real biological samples used; both training and optimization utilize simulations, reserving the biological samples solely for downstream testing and analysis.

All simulations were done in the CAD software Blender 4.0+ using a feature called 'Geometry Nodes', which allowed for a visual workflow encoding the geometry representing different scales of biology. Individual 'frames' were used to control variations and produce the different training and test image datasets needed for teaching the NNs. While each tissue type and scale used distinct geometries, a few general rules were common: the basic concept was to build a plane or volume to fill with cells, then to 'instance' cell objects and randomly fill them into the region. The regions, the fill density/pattern, and the cells were all created dynamically, with random variations added for each frame. The more variation shown to the NN, the better it was able to generalize to real tissue.
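The per-frame variation idea can be sketched outside Blender. In this illustrative sketch (parameter names and ranges are assumptions, not the actual Geometry Nodes inputs), each frame deterministically seeds its own random draw, so every frame yields a different but reproducible combination of cell density, size, noise, and distortion:

```python
import random

def frame_params(frame):
    """One deterministic parameter draw per 'frame' (illustrative ranges)."""
    rng = random.Random(frame)
    return {
        "cell_density": rng.uniform(0.2, 3.0),   # cells per unit area
        "cell_radius": rng.uniform(2.0, 8.0),    # assumed microns
        "noise_mix": rng.uniform(0.0, 0.95),     # compositor white noise
        "skew_deg": rng.uniform(-15.0, 15.0),    # simulated cut angle
    }

def scatter_cells(params, region_area=100.0):
    """Randomly 'instance' cells into a region, as Geometry Nodes does."""
    rng = random.Random(params["skew_deg"])
    n = int(params["cell_density"] * region_area)
    return [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n)]
```

Iterating `frame_params` over thousands of frame indices reproduces the combinatorial variability described above.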

Simulation engine details

The exclusive use of simulated microscopy images allows for generalizable identification of nuclei or cell bodies, organized cellular clusters, and reference regions from anatomical atlases. However, these systems are only as capable as the simulated images on which they are trained. We asked whether we could develop a framework to produce biophysically informed simulations that would be compatible with the SO platforms discussed in Fig. 1. The pipeline below was used to create the simulations in this manuscript but could also serve as a baseline from which to build other synthetic datasets for anatomical regions of interest.

Continuing the multiscale theme, this pipeline operates at micro-, meso-, and macro-scales (Fig. S13A-C). Each biological component is built as a 3D structure with a variety of properties defined and bounded by input parameters58. These components can be thought of as functions, and we represent each as a node with input parameters and outputs that return the geometry of that component (Fig. S13D). We first define the features of single cells (Cell Node), such as size, shape, and intensity, all of which can vary based on cell type and scan parameters and can be controlled through input parameters. The geometry for these (unique) individual cells is then used as an input to the next block (Region Node), which defines the makeup of a region or cluster (the processes in the pipeline are detailed below and in Fig. S13E, indicated by a banner along the top). Cells are scattered throughout this region, with parameters for density and regional variation. The shape of these regions is either simulated generically, for example as laminar stripes, clusters, or rings, or is generated from a set of polygons defining a region from an anatomical atlas. Specifically, for sciatic nerve fascicles, we used rings of cells around a central ellipse of cells. For the mesoscale inside the brain, we used higher-density laminae traversing an image region as a stripe. Next, the geometry from each region is output and used to assemble the whole section (Section Node), which defines what will become a whole output image. Here, the parameters are randomized over a uniform distribution to change the atlas morphology, so that it includes normal physiological variation but also goes beyond it to encompass a variety of pathological phenotypes.
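The Cell Node / Region Node / Section Node hierarchy can be expressed as composable functions. This is a minimal functional analogue under assumed parameter names, not the actual Geometry Nodes graph: each "node" consumes the output of the one below it and returns geometry (here, a list of cell records).

```python
import random

def cell_node(rng, size_range=(2.0, 8.0)):
    """Cell Node: single-cell parameters such as size and intensity
    (names and ranges are illustrative)."""
    return {"size": rng.uniform(*size_range),
            "intensity": rng.uniform(0.3, 1.0)}

def region_node(rng, name, area, density):
    """Region Node: scatter unique cells throughout one region."""
    n = int(density * area)
    return [dict(cell_node(rng), region=name) for _ in range(n)]

def section_node(rng, regions):
    """Section Node: join per-region geometry into one whole section,
    with density randomized per region as described above."""
    cells = []
    for name, area in regions.items():
        cells += region_node(rng, name, area,
                             density=rng.uniform(0.5, 2.0))
    return cells
```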

We utilized the open-source CAD software Blender to produce simulations, since it can be controlled through Python scripting and a visual generative workflow called Blender Geometry Nodes (BGN), and has extensive optics and physics capabilities. The BGN pipeline for a micro-scale simulation focuses on cellular details (Fig. S14). A simplified version of the BGN pipeline used to create the macro mouse coronal brain dataset features transforms that distort the regions and overall section (Fig. S13E). The banner displayed over the pipeline defines the primary functions of each part: Regions, Cells within Region, Edge, Tears, Skew, Deform, and Empty. For the macro-scale reference regions, we defined boundaries as exact copies of the reference polygons for one hemisection, with a switch to mirror those regions into a full set of hemispheres or to use a single hemisection. The density of cells within each region was defined from a table (which we could set manually) or varied randomly between regions. Varying cell density randomly has the benefit of reducing our models' dependence on density when learning region labels. Each region was produced iteratively, indicated under 'Cells within Region'. The geometry from the regions and the cells they contain was then joined to define the whole section.

From here, the geometry of the simulated tissue section was manipulated. Specifically, within the BGN we include 1) Edge, extra cells and debris beyond the defined regions of the atlas (extra chunks of tissue often show up in a section due to errors in histological slicing), and 2) Tears, which simulate rips or tears in the tissue. These two features are disabled when generating ground truth, as we want to reconstruct the tissue as if these imperfections did not exist. Additionally, we include two distortions: 3) Skew, which simulates different cut angles in sections, and 4) Deform, which changes the relative size of the regions as would occur in samples from different individuals (the variation included purposely goes beyond common variation). Finally, there is a block that regulates 5) Empty, or background simulations, to reduce the influence of background variation and false positives in object recognition. More examples of these manipulations are shown in Figs. S15, S16 and a practical guideline is included at https://sidolans01.mgifive.org/. Although not shown in Fig. S13, shading information controlling how light is emitted from, and interacts with, each object is also defined using Blender Shader Nodes to enable variation and patterning.

At this point, the complex 3D geometry needs to be transformed into a 2D image, and the steps for that process are displayed below (Fig. S13F). To do this, we use optics and the rendering engine within Blender. Specifically, we employ a virtual camera with a 15.5 mm focal length and a low f-stop (0.01) to focus on the tissue section from about 1.4 mm away. Then the rendering engine ‘Cycles’ calculates the light being emitted by or bouncing off the samples and uses it to produce an image. After the image layer is produced, we use a compositor layer to add varying amounts of white noise (0–95%), making the image slightly harder to interpret. This process is repeated (per ‘frame’) for the total number of images and produces the simulated ‘microscope’ images for our dataset. At the same time, the software also calculates the boundary coordinates of the various objects within the scene from the camera’s point of view, and projects them onto a 2D plane so they are scaled to the rendered image. These coordinates are used to export the ground truth data, a set of polygon coordinates and class labels. For the ground truth construction, a switch is flipped that turns off some of the variation, such as the rips and the noise, so a clean reconstruction can be learned from the noisy microscope image. Also, during this process, all the parameters utilized are saved as metadata and are exported alongside the image and label files, so all the details can be tracked for different rendering runs and versions. In the macro mouse brain set, we rendered 4,065 images/hour on multiple desktops, with ~700 image sets an hour on a computer with only an 8th generation Intel CPU or NVIDIA 1000 series graphics card (i.e. standard, not specialty hardware).
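The world-to-camera projection used for the ground truth coordinates can be sketched with a pinhole model. Blender's own `bpy_extras.object_utils.world_to_camera_view` performs the real computation; the focal length and camera distance below simply echo the values quoted above, and the simplified geometry (camera on the +z axis, no rotation) is an assumption of this sketch.

```python
FOCAL_MM = 15.5      # virtual camera focal length quoted above
CAMERA_Z_MM = 1.4    # approximate camera-to-section distance quoted above

def project(vertex):
    """Project one world-space vertex (x, y, z) onto the camera plane
    via a pinhole model: image coordinate = focal * coordinate / depth."""
    x, y, z = vertex
    depth = CAMERA_Z_MM - z          # distance along the optical axis
    return (x * FOCAL_MM / depth, y * FOCAL_MM / depth)

def project_polygon(vertices):
    """Project a region outline vertex by vertex, yielding the 2D polygon
    that is scaled to the rendered image and exported as ground truth."""
    return [project(v) for v in vertices]
```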

The parameter input settings are constant for a single 'frame' of the BGN pipeline but are varied across frames. Therefore, when producing the dataset, the software runs through one to hundreds of thousands of frames, creating the combinatorial variability of the training set. At the same time, we also produced a small simulated validation dataset (hundreds to thousands of images). These images can then be fed into any number of image-based NNs, such as a convolutional UNet architecture, instance segmentation architectures such as Mask R-CNN or YOLO124, or Vision Transformers (ViT). In this manuscript, we present data from YOLO instance segmentation, and neural architecture search (NAS) was used to find the optimal settings. The metadata exported during rendering is loaded during training so that it can later be utilized during inference.

SiDoLa-NS-macro-CNS mouse coronal brain and spinal cord

These sections were both produced with the same pipeline explained in Fig. 6. After atlas mapping, the regions were imported into Blender as curve objects using a custom script. These were fed into the Geometry Nodes pipeline described extensively in Fig. 6E. Additional details and images are found in Figs. S12, S13.

SiDoLa-NS-macro-CNS comparison to bottom-up approaches

The comparison of our image-based regional definitions against expression-based regional definitions was done using Kaur et al.'s75 consensus tissue domain detection on a publicly available mouse coronal (standard) Visium dataset. We applied SiDoLa-NS-Macro-mCB to the whole mouse brain slice, leveraging the automatic linking of region labels to cell-level segmentations and transcripts. We compared the atlas predictions (top-down) to the bottom-up results directly by overlaying the predicted region labels on the Seurat-generated UMAPs, and conversely by overlaying the Seurat clusters on the brain image. This yielded four sets of labels to compare with SiDoLa-NS-Macro-mCB atlas labels. All the methods used the expression data and variations of unsupervised clustering to define numbered discontinuous spatial regions. This comparison is biased towards the expression techniques, since we compared all the labels back to the molecular definitions of the regions from the consensus tissue domain manuscript.

SiDoLa-NS-meso-pSN porcine sciatic nerve

Sciatic nerves were digitally generated using Blender with Geometry Nodes to simulate biophysical properties. The aim was to replicate natural structures within a sciatic nerve, including fascicles and Schwann cells. The Blender simulations were simplified to basic geometric forms: ovular fascicles, circular Schwann cell nuclei, and a circular epineurium. Additionally, varying sizes and counts of fascicles were distributed to ensure a manageable yet flexible simulation for training, effectively preparing the model for real-world data. These simulated components were rendered into a set of .bmp files through Blender scripting, which produced the dataset.

SiDoLa-NS-micro-CNS mouse coronal brain and spinal cord

The basis of this simulation was to instance ellipsoid nuclei on many points across a plane. The nuclei varied in size, aspect ratio, rotation, and shading (which featured bright heterochromatin spots). We also introduced two other meso-scale features. The first was a bright laminar stripe or striation: bands of much higher-density nuclei compared to the adjacent cells (Fig. S14). The other was a region devoid of cells, usually thinner than the high-density lamina, that also crossed the image. Finally, a slight amount of depth-of-field with a low f-stop was used to simulate slight changes in focus.

SiDoLa-NS-micro-PNS mouse peripheral nervous system

This micro model is the original from which the others were derived. It consisted of a plane filled with points that were split between two different cell-type instances, which could be scattered in different ratios. One cell type was Schwann cell nuclei, which are rounder, small, and bright. The other was axons, which are slightly larger, often oblong, and dim to nearly invisible (by DAPI staining). The plane containing these was circular and had additional geometry outside of it representing the epineurium. The plane was also shifted around the frame of the camera so that different edges of the perineurium were occasionally exposed.

Optics, noise, and ground truth polygons

A virtual camera within Blender was used to translate the three-dimensional geometry into an image. This was done with the rendering engine Cycles or EEVEE and involved calculating where the emitted light from the fluorescent objects fell, or where light from the background in an IHC setting was blocked by the objects. The camera had depth-of-field and a small aperture to give slight out-of-focus regions. The camera was also tilted and twisted to give different viewing angles, and raised or lowered to adjust the overall field-of-view and zoom. White noise was added on top of the final image with the compositor and mixed at different ratios to make the SiDoLa-NS models robust against noise. In addition to the standard rendering, we also extracted the coordinates of the vertices making up the 3D geometry and projected them onto a plane from the camera's point of view, converting them into camera x,y coordinates (world to camera). These vertices were then reassembled into polygons directly, or a convex hull was applied. In more complex setups (atlas regions), we also downsampled the resolution of the polygons so that the exported segmentation files were more manageable. These files were exported in YOLO .txt or COCO .json format automatically as the images were rendered.
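The YOLO segmentation label format referenced above is one text line per object: the class id followed by polygon vertices normalized to [0, 1] by the image size. A minimal writer might look like this (the function name is ours, but the line layout matches the Ultralytics segmentation format):

```python
def yolo_seg_line(class_id, polygon, img_w, img_h):
    """One YOLO segmentation label line: class id followed by the polygon's
    x,y coordinates normalized to [0, 1] by the rendered image size."""
    coords = []
    for x, y in polygon:
        coords += [f"{x / img_w:.6f}", f"{y / img_h:.6f}"]
    return " ".join([str(class_id)] + coords)
```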

Model training

The images created can be used with a variety of instance segmentation or object detection architectures. All the inferences in this document utilized Ultralytics YOLO v8 or v9 pre-trained foundation models in a Python environment with CPU or GPU. For micro models, where the exact outline of the nucleus was not required, we trained ellipsoid nuclei as bounding boxes with YOLO v9c and v9e. For more complex shapes, as in the macro models, we trained with YOLO v8n-seg and v8l-seg foundation models to extract the full polygon masks from the inferences. The number of images used in training is listed throughout the results. Generally, training ran for 10-200 epochs, the image size was between 448 and 640 pixels in width and height, and the batch size was between 8 and 32 images per batch. All these factors were varied, along with the random seed, for each NAS run. The results of each run were copied into a folder named by the resulting mAP75 on the validation dataset, and then evaluated on biological examples as a quick reference.
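The hyperparameter sweep described above (epochs, image size, batch size, and seed varied per NAS run, with results folders named by validation mAP75) can be sketched as a simple sampler. The function names and the exact folder-name pattern are our assumptions; in Ultralytics the actual training call is along the lines of `YOLO("yolov8n-seg.pt").train(data=..., epochs=..., imgsz=..., batch=...)`.

```python
import random

def sample_run(seed):
    """Draw one NAS configuration from the ranges quoted above."""
    rng = random.Random(seed)
    return {
        "seed": seed,
        "epochs": rng.randint(10, 200),
        "imgsz": rng.choice([448, 512, 576, 640]),
        "batch": rng.choice([8, 16, 32]),
    }

def run_dirname(map75):
    """Results folder named by validation mAP75, e.g. 'map75_0.913'
    (the exact naming pattern is illustrative)."""
    return f"map75_{map75:.3f}"
```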

Transcript processing

The 10X Visium HD Brain dataset, acquired from a public source, was provided at 2, 8, and 16 µm resolution in the binned outputs folder. We used pandas to export the tissue parquet files as CSV files, acquiring the SRT data in micron coordinates. The 10X Xenium Spinal Cord was provided as a CSV file, with individual transcripts identified by micron coordinates. The CSV mapped directly to the DAPI-stained image of the tissue sample. Given the slice's width and height in microns, SiDoLa-NS automated the scaling required to map the transcripts to the segmented cells.
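The micron-to-pixel scaling that maps transcripts onto the segmented image reduces to a linear rescale. A minimal sketch, assuming the image spans the full slice extent (the function name is ours):

```python
def micron_to_pixel(tx_um, ty_um, slice_w_um, slice_h_um, img_w_px, img_h_px):
    """Map a transcript's micron coordinates onto the stained image,
    assuming the image spans the slice's full width and height."""
    return (tx_um * img_w_px / slice_w_um,
            ty_um * img_h_px / slice_h_um)
```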

Data collection and image preprocessing

Mouse coronal brain

The Visium HD Brain H&E image was acquired from 10X Genomics. The microscopic image provided was an H&E-stained section. Utilizing ImageJ, we split the image into its constituent channels and selected the channel that best highlighted nuclear features (the red channel) for further processing. We adjusted the image brightness and contrast, applying a linear normalization to the whole image. For macro/meso/micro inference, these images were given directly to the program, without downsampling or tiling. For macro, an ensemble of three atlas models was used. The macro class is assigned based on whether the center of the cell is within the boundary of the larger region.
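The cell-center-in-region assignment is a standard point-in-polygon test. A self-contained ray-casting sketch (the implementation is ours; any geometry library's containment test would serve equally well):

```python
def point_in_polygon(px, py, polygon):
    """Ray-casting test: is the cell center (px, py) inside the region
    boundary given as a list of (x, y) vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):              # edge crosses the ray's level
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:                    # crossing is to the right
                inside = not inside
    return inside
```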

Mouse Alzheimer’s brains

Four brain hemi-sections from a publicly available dataset, "Xenium In Situ Analysis of Alzheimer's Disease Mouse Model Brain Coronal Sections from One Hemisphere Over a Time Course - 10x Genomics", were downloaded. The DAPI images were rotated to be upright in ImageJ and then targeted for macro/meso/micro inference as above. Statistics for effects on these tissues were computed by linear regression against this scale: WT 8 = 0, TG 8 = 0.2, WT 13 = 0, TG 18 = 1 for regional confidence, and this scale: WT 8 = 1, TG 8 = 1, WT 13 = 1.5, TG 18 = 0.5 for cell numbers per region.

Mouse spinal cord immunofluorescence

For the spinal cord analysis, we acquired the IF images from a spinal atlas manuscript70. The spinal cord dataset included nine different cross sections of spinal cords, each containing three different channels of ChAT, Nissl, and DAPI stains. We selected Slide1-6_Region0007_Channel555 nm,475 nm,395 nm_Seq0051 and labeled each spinal cord from one to nine starting from the top and moving horizontally. Taking the second spinal cord, we cropped it into a square and split it into three separate channels using ImageJ. The Nissl channel was then used for our SiDoLa-NS models’ predictions.

Mouse spinal cord xenium

Mice were housed and used following the institutional animal study guidelines and protocols approved by the Institutional Animal Care and Use Committee of Washington University in St. Louis. For spinal cord dissection and fixation, a 109-day-old C57Bl6-J female mouse was first deeply anesthetized in a 5% isoflurane induction chamber until complete loss of pedal reflex was confirmed. While under deep anesthesia, the animal was euthanized by cervical dislocation followed by thoracotomy. The right atrium was then incised, and transcardial perfusion was performed through the left ventricle with 10 mL of ice-cold RNase-free phosphate-buffered saline (PBS) to remove blood, followed by 10 mL of freshly prepared 4% formaldehyde (Sigma Aldrich, cat. 100496) for tissue fixation. To preserve RNA integrity, the entire spine, including the spinal cord, surrounding bone, and musculature, was rapidly dissected and incubated in 4% formaldehyde at 4 °C for 48 h. After fixation, the surrounding bone and muscle structures were removed by dissection, and the exposed spinal cord was transferred to 70% ethanol at 4 °C until processing. The fixed spinal cord tissue was dehydrated and embedded in paraffin using a tissue processor (Leica TP 1020) through graded ethanol, xylene, and melted paraffin immersions. The spinal cord sample was then embedded in a cold paraffin block, sectioned into 6 µm slices with a microtome, and floated in a 42 °C water bath. Tissue sections were mounted on Xenium slides (10X Genomics, PN-1000460) within the 12 mm x 24 mm imageable area. Slides were dried for 30 min at room temperature, incubated for 3 hours at 42 °C in a dryer oven, and placed in a desiccator to dry overnight at room temperature. Finally, the slides with the lumbar region were hybridized with 10X Mouse Brain Panel probes (which included ChAT) and processed using the Xenium Analyzer following the manufacturer's instructions.

Porcine sciatic nerve

The porcine sciatic nerve slide was obtained from Saarland University's histology site (https://mikroskopie-uds.de/)127. The sciatic nerve cross section had a scan area of 6.2 mm x 4.8 mm and H&E staining. The image was downloaded as a .zif file, which was loaded into ImageJ, and the highest-resolution version of the sciatic nerve was exported as a 38,885 x 30,286 pixel 8-bit image. The channels were then split. Both channel 1 and channel 2 highlighted the fascicles within the sciatic nerve. Due to channel 2's more distinct contrast between the fascicles and the background, channel 2 was used for Fig. 4. Channel 3 highlighted the axons within the fascicles. The pixels were then inverted to create a black background with white fascicles and adjusted to enhance the visibility of the fascicles and axons against the background and to mimic DAPI staining.

Inference

SiDoLa-NS must work on all parts of the image at different scales, and that information must then be fused together, along with any other omics data corresponding to the image dataset. This was accomplished with the macro/micro inference package included in the code repository. The input allowed a single micro-scale model to do the cell segmentation, plus as many meso- or macro-scale models as desired. Some parameters were set for each model, such as the location of the images, and then the software ran all the inferences and linked the data together. For the micro-scale, the multiple channels available from the image were measured across three masks: Box, Poly, and Voronoi (representing the bounding box, the full polygon outline if available, and the space around the cell until the next cell's boundary is reached). Area, Size, Intensity, and Texture (SD) were measured for each mask and reported on a per-cell basis. If there was more than one macro-scale model, hard- and soft-voting ensembles were generated and reported.
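The hard- and soft-voting ensembles can be sketched per cell: hard voting takes the majority label across the macro models, while soft voting averages per-class confidences before taking the argmax. This is a generic sketch of the two schemes, not the package's actual implementation.

```python
from collections import Counter

def hard_vote(labels):
    """Majority label across macro-scale models for one cell."""
    return Counter(labels).most_common(1)[0][0]

def soft_vote(confidences):
    """Sum per-class confidence across models, then take the argmax.
    `confidences` is a list of {class: score} dicts, one per model."""
    totals = Counter()
    for conf in confidences:
        totals.update(conf)
    return max(totals, key=totals.get)
```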

Analysis

Model evaluations

SiDoLa-NS models were evaluated during training utilizing the Ultralytics validation tools. For all our models, the validation set used during training consisted of simulated images. For each validation image, many metrics including precision, recall, IoU, and confidence were recorded. Precision and recall were used to generate an F1 score (\(\frac{2\cdot {precision}\cdot {recall}}{{precision}+{recall}}\)), which was also graphed across confidence values. Additionally, a confusion matrix was generated for visualization of classification accuracy, particularly in multi-class tasks like brain and SC atlas mapping.
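The F1 computation from detection counts is worth making concrete, since it is the harmonic mean of precision and recall rather than their ratio. A minimal sketch from true positives, false positives, and false negatives:

```python
def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```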

Robustness analysis

Various robustness analyses were conducted on the macro models for noise, degree of rotation, and focus. We generated a small set of 18-25 images with calibrated variations in one of the three ways. For noise, we altered the image resolution by reducing the pixel contrast. For degree of rotation and zoom, the camera was respectively rotated or moved along the z-axis relative to the reference object. We also adjusted the f-number of the camera's aperture for focus. For each of these images, we additionally generated a ground truth, which was used in evaluation. The mAP50 was primarily utilized in assessing model performance.

Marker gene analysis

Because SiDoLa-NS-Macro-mCB and SiDoLa-NS-Micro-CNS were tested directly on the Visium HD mouse brain slice, we mapped the gene expression matrix at the 16 µm bin level to the data based on micron position. For the Xenium spinal cord slice, SiDoLa-NS-Macro-mSC and SiDoLa-NS-Micro-CNS were both evaluated, with transcripts automatically assigned to cells. This resulted in each detected cell having a region assignment and a known gene count. For the list of 'top ratio' genes, we ran a Kruskal-Wallis test of all genes against regions to determine the highest-ranked genes. From there, the ratio of each gene's count to the region's total gene counts allowed us to rank these genes per region, yielding our list of the top 4 genes.
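The ratio-ranking step can be sketched directly (the Kruskal-Wallis pre-filter is omitted here; in practice it would come from a statistics library such as `scipy.stats.kruskal`). The function name and the input layout are our assumptions:

```python
def top_ratio_genes(counts, region, k=4):
    """Rank genes by their share of a region's total counts.
    `counts` maps region -> {gene: count}; returns the top-k genes."""
    region_counts = counts[region]
    total = sum(region_counts.values())
    ratios = {g: c / total for g, c in region_counts.items()}
    return sorted(ratios, key=ratios.get, reverse=True)[:k]
```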

UMAPs and clustering

The UMAPs were viewed using the Seurat embeddings from the Visium HD dataset at the 16 µm bin level. The UMAP1 and UMAP2 data were plotted and colored by cluster. For the cluster definitions, we utilized the provided graph-based clusters, which we joined to the UMAP via the barcodes. The SiDoLa-NS UMAP in Fig. 2D uses the same underlying projection data but is colored with the SiDoLa-NS-Macro-mCB predictions for each barcode's region.