Main

Multiple sclerosis (MS) is the most common inflammatory disease of the central nervous system, characterized by multifocal demyelination, axonal damage and chronic tissue remodeling including gliosis, neuronal loss and eventually brain atrophy1,2,3. An issue with chronic MS is compartmentalized inflammation4, particularly at rim areas of subcortical white matter lesions5. Several postmortem and magnetic resonance imaging studies have demonstrated that those chronic active lesions often show iron uptake at lesion rim, centrifugal lesion expansion and disease progression6,7,8,9. Therefore, understanding the cellular composition and molecular dynamics of the lesion microenvironment would have a critical impact on future biomarker and interventional studies in people diagnosed with MS.

MS lesions follow temporal and spatial patterns of inflammation and tissue damage, thus providing indirect information about the lesion stage10,11,12. Acute MS lesions commence through active myelin breakdown with presence of myelin-phagocytosing cells, including macrophages, activated microglia and astrocytes13,14,15. Lesions may then remyelinate16 or enter a chronic active stage with a distinct inflammatory rim and a well-demarcated demyelinated core6,7. Eventually, inflamed lesions can become inactive, without rim inflammation but low-level microglial activation and a dense glial scar formed by astrocytes17,18. Nevertheless, the precise lesion kinetics remain unclear, as chronically inflamed and iron-positive rims can persist for years in certain people with MS5,19. Over the past decades, histopathological and magnetic resonance imaging studies have collectively provided insights into inflammatory lesion development and temporal lesion dynamics20,21. However, due to a lack of spatially resolved molecular tools, it has not been possible to discern spatially restricted areas of cell-type-specific pathology and map back those changes to defined lesion and nonlesion areas.

In this study, we paired single-nucleus RNA sequencing (snRNA-seq) with spatial transcriptomics (ST) to create a large atlas of subcortical MS lesion pathology. We characterized how cell types organize in MS lesions and found a previously unreported ciliated astrocyte subtype in the demyelinated core of MS lesions. We reported the molecular differences between control, chronic active and inactive lesions, revealing distinct tissue niches. Finally, we identified spatially restricted cell–cell interactions involving glial, endothelial and immune cells specific to the rims of chronic active lesions. Our findings can be further used to target potential key events in MS pathology.

Results

Transcriptomic cell type mapping of subcortical MS lesions

We characterized a total of 12 MS lesions by their level of demyelination and inflammation, and seven control (CTRL) samples from subcortical control white matter. Based on the presence of a demyelinated lesion core (stained with Luxol fast blue (LFB)), activated myeloid cells (CD68 and CD163 immunoreactivity22) and iron uptake at the lesion rim, eight lesions were classified as chronic active (MS-CA). The other four lesions with a demyelinated core and low-level rim inflammation were classified as chronic inactive (MS-CI) (Fig. 1a and Supplementary Table 1; Methods). All samples met sequencing quality standards and were profiled for snRNA-seq and/or ST, resulting in an ST study cohort of seven CTRL, eight MS-CA and four MS-CI lesions (n = 19), from which five CTRL, six MS-CA and four MS-CI were paired for snRNA-seq (n = 15) (Fig. 1b and Supplementary Table 2; Methods).

Fig. 1: Spatial and cell-type profiling of subcortical control and MS tissues.
figure 1

a, Lesion characterization design (left) and histological assessment (right) of MS lesions with IHC for CD163 and CD68 (myeloid cells) and stains for iron and LFB (myelin). Scale bars, 500 µm (Overview), 100 µm (CD163, CD68, Iron, LFB). b, Study design showing different data modalities and associated metadata. Links between samples indicate that they come from the same tissue block. NA, not available. c, Barplot with number of cell subtypes per main cell type (left), UMAP of integrated snRNA-seq data (n = 103,794, center) and dot plot of averaged z-transformed gene expression of marker genes for each cell type (right). Color corresponds to annotated cell types. d, Spatial panel displaying H&E staining (left), feature plots of cell type deconvolution results (center) and inferred pathway activities (right) in CTRL and MS lesion types. e, Pipeline design using MOFA+ to generate a spatial clustering from the integration of gene expression and deconvoluted cell type proportions (left) and comparison of histopathological areas with the computationally annotated niches between lesion types (right). f, Dot plot of averaged z-transformed gene expression of marker genes for each niche. g, Heatmap of z-transformed cell type proportions for each niche, asterisks indicate significance (adjusted P < 0.05). Min., minimum; max., maximum.

For the snRNA-seq dataset, we obtained a total of 103,794 nuclei (n = 16, average = 6,487) with a mean of 2,125 detected genes per nucleus after removing low-quality profiles and potential doublets (Extended Data Fig. 1a–d; Methods). Based on pan-cell type marker genes and previous reports5,23, we identified and annotated nine main cell types, including oligodendrocytes (OL), oligodendrocyte progenitor cells (OPC), astrocytes (AS), myeloid cells (MC), endothelial cells (EC), T cells (TC), B cells (BC) and stromal cells (SC) (Fig. 1c and Supplementary Table 3; Methods). We also captured some neuronal cell clusters (NEU) during tissue sectioning that originated from surrounding cortical and subcortical gray matter areas. Comparison of our snRNA-seq atlas with another recent study focusing on subcortical MS lesions5 confirmed presence and identity of all main cell types (Extended Data Fig. 1e; Methods). Principal cell types were further subclustered and annotated into 79 different subtypes based on marker gene expression, generating a fine-grained annotated atlas (Fig. 1c, Extended Data Figs. 2a–d and 3a–d and Supplementary Table 4; Methods). Cell types were well represented by several samples and MS lesion types, suggesting annotation robustness (Extended Data Fig. 1c,d).

For ST, we obtained a total of 67,851 spots (n = 18, average = 3,769) with a mean of 1,373 genes per spot (Extended Data Fig. 4a,b). We aimed to capture the lesion rim at the center of each slide to study spatial changes on both sides (periplaque and lesion core). As described above, LFB stains and CD68/CD163 immunohistochemistry (IHC) were used as standard histological techniques to identify lesion rim areas in MS sections (Extended Data Fig. 4c). Since ST can capture several cells per spot, we improved its resolution by inferring cell type compositions across spots (Methods). Using the paired snRNA-seq atlas as a reference, we deconvoluted ST slides and found that cell types spatially mapped the expected tissue architecture across CTRL tissue areas and MS lesions (Fig. 1d). CTRL slides were composed uniformly of OL, whereas in the demyelinated core of MS lesions we found a high abundance of AS, characteristic of the glial scar24. Predicted cell type proportions in ST were similar to those observed by snRNA-seq, both at the sample and cell type level (Extended Data Fig. 5a,b; Methods).

Further, we performed pathway enrichment analysis to explore biological functions in lesion and nonlesion areas (Fig. 1d and Extended Data Fig. 4b,c; Methods). High myelination activity (R-HSA-9619665) was found in regions with a high OL density, such as CTRL and MS non-lesion areas25. Conversely, interferon gamma signaling (R-HSA-877300) was predicted to be active in the demyelinated core and at inflamed lesion rims. Pathways related to tissue remodeling by assembly of collagen fibrils (R-HSA-2022090) mapped strongly to demyelinated lesion areas, particularly in MS-CI lesions. Taken together, these results show that spatial cell type mapping using deconvolution prediction models, can accurately predict the presence of cell types and associated pathways in lesion and nonlesion tissue areas.

By histological and IHC assessment (Methods), we manually annotated different tissue areas: gray matter (GM), white matter (WM), periplaque WM (PPWM), lesion rim (LR), lesion core (LC) and ependyma (EP) (Extended Data Fig. 4b). However, this annotation method was biased to the dyes and antibodies used and, hence, prone to errors. For this, we designed an unsupervised approach using factor analysis26, combining gene expression and deconvoluted cell type proportions to define niches (Fig. 1e and Extended Data Fig. 6a–c; Methods). The comparison between histologically annotated areas and unsupervised niches revealed a significant overlap both at the cluster and tissue section level (mean Jaccard index 0.58; mean adjusted Rand index (ARI) 0.62, one sample t-test P = 2.36 × 10−7, alternative: ARI > 0) (Extended Data Fig. 6d). Across slides, niches exhibited higher intraniche than interniche similarity, both at the gene expression and cell type proportion levels, suggesting robust annotation (Extended Data Fig. 6e). This approach allowed the spatial profiling of different cell types per niche. For example, NEU (SYT1) were characteristic for GM, OL (PLP1, SOX10) were most prevalent in WM and PPWM, MC (CSF1R) profiles were enriched in inflamed LR and AS (GJA1) were most prominent in the LC scar (Fig. 1f,g). Computationally annotated niches showed higher granularity, which enabled us to identify a small and diverse tissue niche that we annotated as vascular infiltrating (VI) (Extended Data Fig. 6f), characterized by TC, SC and EC cell types (Fig. 1f). In summary, niche annotations revealed a precise cell type organization in subcortical MS and CTRL tissues.

Next, we performed a neighborhood analysis based on niche annotations to measure tissue compartmentalization across MS sections (Fig. 1e and Extended Data Fig. 6g; Methods). We observed two independent groups of colocalized niches: border (PPWM, LR) and lesion (LC, VI) niches. Next, we explored gene expression changes between niches (Extended Data Fig. 6h; Methods). Most of the genes in WM CTRL and PPWM were related to WM tract maintenance and homeostatic OL function. For example, HDAC11 (ref. 27) encodes a protein that regulates myelin function, ERMIN28 determines an important myelin structure protein and LGI3 (ref. 29) encodes a ligand for ion channel homeostasis of myelinated axons29,30. Conversely, the VI niche showed an upregulation of endothelial cell-type-specific genes, such as EMP2, NECTIN2 and AMOTL1 (or JEAP). For example, EMP2 regulates VEGFA31 function and expression of various integrins30, important for immune cell trafficking across the blood–brain barrier. NECTIN2 (CD112)—a ligand of CD226 (ref. 32)—is linked to MS disease worsening due to CD226 gene variants33,34. Pathways such as JAK-STAT and TGFβ, also involved in tissue inflammation and remodeling, were enriched in the VI niche. Other niches, except WM in CTRL, showed an enrichment of TNF signaling activity, indicating sustained tissue inflammation in MS-associated niches (Extended Data Fig. 6i–k). These results indicate that unsupervised niche characterization recapitulates tissue organization in subcortical MS lesions.

Collectively, integration of snRNA-seq with ST enabled us to generate a comprehensive paired dataset of subcortical WM in MS pathology, enabling investigation of gene expression profiles and spatial cell-type relationships.

Identification of ciliated AS in MS lesions

Next, we focused on the molecular diversity of AS in subcortical MS lesions. AS play critical roles in MS pathogenesis by either propagating inflammation or promoting tissue homeostasis35,36. We decoded AS transcriptomic heterogeneity and identified ten subtypes: homeostatic GM, homeostatic WM (Homeo), stress (Stress), transitioning ciliated (TransC), ciliated (Cilia), reactive (React), three different disease associated (Dis1, Dis2, Dis3) and phagocytic (Phago) (Fig. 2a,b; Methods). We then compared our AS subtype annotation with a previously published snRNA-seq atlas5 and confirmed most subtypes based on gene expression (Extended Data Fig. 7a,b). The largest number of shared marker genes was within the ciliated subpopulation, labeled previously as senescent. We noticed that the ciliated AS subtype in our dataset was overrepresented by two samples containing the EP niche (Fig. 2c and Extended Data Figs. 6b and 7c). Since AS and EP cells differentiate from radial glia and share a similar molecular profile37, we removed those samples from further analysis to avoid a characterization bias (Extended Data Fig. 7d,e). Marker gene extraction (Supplementary Table 5; Methods) identified several genes related to function and structure of the motile cilia axoneme, including genes encoding different dynein family members (DNAH11, DNAH6 (ref. 38)), cilia- and flagella-associated genes (CFAP54 (ref. 39) and CFAP299) and SPAG17, which produces an important microtubule-stabilizing protein of the axoneme40. Of note, these genes were also upregulated in the analogous senescent AS cluster5 (Fig. 2d and Extended Data Fig. 7f).

Fig. 2: Characterization of a ciliated AS subtype in MS lesions.
figure 2

a, UMAP of AS subtypes based on snRNA-seq. b, Dot plot of averaged z-transformed gene expression of marker genes for each AS subtype. c, Proportion of samples for each AS subtype. d, Violin plots of Cilia AS marker genes compared with other AS subtypes (n = 283 for Cilia AS, n = 12,704 for rest) (two-tailed t-test; BH-adjusted P < 0.05). e, Predicted number of Cilia AS per niche (empirical P < 0.05). f, Spatial feature plots showing the niches (left) and spots predicted to contain Cilia AS (right). g, smFISH for SPAG17 and ADCY2. Quantification of Cilia AS as percentage of total AS population in MS-CA. Cilia AS were quantified at lesion border (PPWM and LR) and in LC (n = 25 for border, n = 11 for LC) (two-sided Wilcoxon rank-sum test; P < 0.05). h, IF staining of cilia (SPAG17) in EP cells adjacent to the lateral ventricle. Scale bars, 5 µm. i, IF staining of damaged axons (SMI32) and cilia (SPAG17) in the LC of MS-CA lesions. j, Two IF examples showcasing the length of AS cilia in the LC of MS-CA lesions; note presence of elongated SPAG17+ cilia in AS versus EP cells. k, Dot plot of top-ranked enriched pathways. Color indicates overrepresentation significance by Fisher exact test, and size indicates odds ratio between Cilia AS marker genes and each geneset. FDR, false discovery rate. l, Violin plots of predicted transcription factor activity from snRNA-seq (n = 283 for Cilia AS, n = 12,704 for rest) (two-tailed t-test; BH-adjusted P < 0.05), showing the median (white dot), 25th percentile (Q1), and 75th percentile (Q3), with the ends of the black box indicating Q1 and Q3; the whiskers extend to the furthest datapoint within 1.5 times the interquartile range (IQR), specifically 1.5× (Q3–Q1), beyond Q3 or below Q1. m, Transcription factor FOXJ1 (square) and its target genes (circles). n, Spatial feature plots showing FOXJ1 predicted activity (blue/red) and gene expression of some of its target genes (purple/green). Scale bars, 20 µm (g, h (large panel), i, j).

To gain insight into the spatial localization of ciliated AS in MS lesions, we mapped their expression profile to ST slides and found that they mapped strongly to the LC niche (Fig. 2e,f; Methods). We validated this prediction by single-molecule fluorescence in situ hybridization (smFISH) using probes against ADCY2 (AS marker) and SPAG17 (cilia marker). Indeed, we observed an increase of ADCY2-SGPAG17 coexpressing cells in the LC of MS-CA lesions, with a signal comparable with that of EP cells. (Fig. 2g and Extended Data Fig. 7g,h; Methods). Then, we tested SPAG17 at protein level using immunofluorescence (IF) and confirmed specificity for cilia-forming EP cells. Notably, SPAG17 did not overlap with GFAP and SMI32 staining in MS LC areas, demonstrating that the signal was unrelated to astroglial and axonal fibers (Fig. 2h,i; Methods). We then measured their length and found cilia up to 82 µm, substantially larger than those in EP cells (2.5 µm) (Fig. 2i,j).

After validating the existence of ciliated AS in MS scar tissue, we characterized their molecular function (Methods). Enrichment analysis of marker genes identified only pathways related to cilia biogenesis (Fig. 2k). Similarly, transcription factor activity inference based on target gene expression predicted FOXJ1, REST and TBX1 to be activated in ciliated AS (Fig. 2l). FOXJ1—crucial for motile cilia formation—regulates the expression of various target genes, including EZR and CETN2, which encode key ciliary components41. These genes, along with inferred FOXJ1 activity, were present in spots predicted to contain ciliated AS cells (Fig. 2m,n and Extended Data Fig. 7i).

In summary, these results confirm an AS subtype generating large cilia structures associated with glial scar in demyelinated cores of subcortical MS lesions.

Characterization of differences between MS lesion types

We next characterized the molecular and compositional variability across samples. We calculated patient distances by aggregating information from both snRNA-seq and ST, and projected samples into a two-dimensional space (Fig. 3a and Extended Data Fig. 8a–h; Methods). Silhouette scores determined that samples clustered based on their lesion type (Extended Data Fig. 8i; Methods), suggesting larger differences between than within patient groups.

Fig. 3: Differential cell-type-specific profiling of CTRL and MS lesion tissues as well as corresponding niches.
figure 3

a, MDS plot based on the aggregation of molecular and compositional differences across paired samples. Color indicates MS lesion type and control. Shape indicates biological sex. b, Cumulative number of DEGs per cell type across conditions. Color indicates condition as in a. c, Venn diagram visualizing overlap of aggregated DEGs between CTRL and MS lesion types. d, OL DEGs (left) and subtype composition (rigplot of enriched pathways for DEGst) associated with DEGs. e, AS DEGs (left) and subtype composition (right) associated with DEGs. f, MC DEGs (left) and subtype composition (right) associated with DEGs (n = 6 for CTRL, n = 6 for MS-CA, n = 4 for MS-CI). g, Cumulative number of DEGs per niche between MS lesion types. Color indicates MS lesion type as in a. h, Volcano plot of DEGs for LR and VI niches. Color indicates association to a MS lesion type as in a. i, Heatmap of pathway enrichment activities between niches. Color indicates association to a MS lesion type as in a. j, AS and MC compositional changes per individual niches between MS lesion types (n = 8 for MS-CA, n = 4 for MS-CI). Two-tailed Wilcoxon rank-sum test, BH-adjusted P < 0.10. Violin plots illustrate the median (white dot), 25th percentile (Q1) and 75th percentile (Q3), with the ends of the black box indicating Q1 and Q3; the whiskers extend to the furthest datapoint within 1.5× the IQR, specifically 1.5 × (Q3–Q1), beyond Q3 or below Q1.

For snRNA-seq, we performed differential expression analysis for each main cell type across CTRL, MS-CA and MS-CI (Supplementary Table 6; Methods), identifying condition-specific genes. OL, MC and AS had the most differentially expressed genes (DEGs), followed by OPC and EC (Fig. 3b and Extended Data Fig. 9a–c). Most dysregulated genes were specific to MS, with many shared between MS-CA and MS-CI (Fig. 3c). Despite the similarities, we identified differences between them, while CTRL tissue had a unique signature having minimal overlap with MS-CA and MS-CI.

We next quantified the differences in cell type and subtype abundances between CTRL and MS tissues using snRNA-seq (Supplementary Tables 8 and 9; Methods). At the cell type level, only TC expanded while OPC were reduced in MS tissues based on relative abundance analysis (Extended Data Fig. 9d). Moreover, OPC, OL, AS and MC cell subtypes showed strong changes in their abundances between CTRL and MS (Fig. 3d–f and Extended Data Fig. 9e). Therefore, we focused on the top three cell types (OL, AS, MC) expressing the strongest transcriptomic changes associated with the respective subtypes. CTRL-enriched OL genes originated from Homeo1 and Homeo3 subtypes, with fewer from Homeo2, whereas MS-enriched OL derived from Dis1 and Dis2 subtypes (Fig. 3d and Extended Data Figs. 2a and 8e). CTRL OL genes were related to differentiation (ERBB2, NDE1, CDK18), myelin production and maintenance (ADAMTS4, EPHB, ELOVL6). Conversely, MS-CA OL genes were related to inflammation (EIF5, NFKB2, IRF, CD274) and cell stress (ATF4, HSPB1, HSP90B1). In MS-CI, OL genes were related to tissue remodeling (TGFBR2), lipid regulation (LGALS3, MYRIP, MPZ, NGFR), cell differentiation (SOX4, GMFB) and cell stress (OSMR, SLC22A17, DCC, BRCA2). Further, disease-associated OL genes could be associated with ER stress as well as antigen processing and presentation pathways (Extended Data Fig. 9f).

For CTRL-enriched AS, genes originated from the Homeo subtype, and their predicted functions were linked to differentiation (PDK1) and homeostasis (TANGO2, ALDH1L1, NNT) (Figs. 2a and 3e). Conversely, both MS-enriched genes derived from React, Dis1 and Dis2 AS subtypes. In MS-CA, AS genes were linked to tissue inflammation (OSMR, HMGB1, HLA-F, SERPINA3, C3, TNFRSF1A, CLU, ITGB1), blood vessel flow function (SMAD6), anti-oxidation (NFE2L2) and endosomal functions (APP, LRP1, EEA1), suggesting roles in debris uptake and clearance. In both lesions, we found genes related to cilia motility functions (FOXJ1, CFAP97). MS-CI AS genes were associated with functions related to scar formation (ITGA2, COL6A1), debris clearance (AKT3, CTSD) and interferon signaling (ARL5B, IFITM3, IFITM2). Additionally, pathways involved in platelet adhesion and aggregation were upregulated in MS-CI AS (Extended Data Fig. 9f).

CTRL-enriched MC genes originated from Homeo1 and Homeo2 subtypes and were related to homeostatic MC functions (P2RY12, CX3CR1, FRMD4A) and synaptic maintenance (ARHGAP12) (Fig. 3f and Extended Data Figs. 2c and 9f). In contrast, MS-enriched genes derived from CA and Dis subtypes. MS-CA genes were related to MC activation (TRAF3, PARP9, SPP1), iron and lipoprotein metabolism (CD163, FTL, APOE, APP), complement activation and heparin binding (C1QB, C1QC, GPNMB), and cytoskeleton remodeling (PARVG, IQGAP1). MS-CI MC genes were linked to tissue remodeling and regulatory functions (ANXA2, LRRK2, TGFB1, AGPS), together with genes encoding proteins related to pattern recognition, cell motility and cell–cell signaling (CLEC7A, CLEC12A, ITGB1, PI3CG).

Next, we investigated changes in differential gene expression (DGE) across tissue niches and MS lesion types (Fig. 3g,h, Extended Data Fig. 9g and Supplementary Table 7; Methods). We observed that dysregulated VI genes originated mainly in MS-CA, together with the largest number of dysregulated genes deriving from LR and LC niches. (Fig. 3g). In MS-CA, LR and VI niches shared a lot of genes encoding for functions related to cell migration and invasion (FSCN1, LAP3), proliferation (PAG1, S100A9), inflammation (SPP1, C1QA, HSPH1), interferon signaling and tissue remodeling (TGFBR2, IFNGR2) (Fig. 3h). Additionally, inflammation and TGFβ signaling pathways, including loss of SMAD2/3 function, were enriched in LR and LC niches (Fig. 3i). Conversely, pathways enriched in MS-CI were related to tissue integrity, mitochondrial damage and cellular starvation.

The LC-enriched genes, which encode proteins with anti-inflammatory (PHLDA1, TGFB1) and tissue-regulatory functions (MAFB, VISG4, MERTK), were more enriched in MS-CA compared with MS-CI tissues (Extended Data Fig. 9g). Additionally, the LC niche shared genes with the VI niche that were related to myelination (PHGDH). In contrast, the PPWM-enriched genes, which encoded for proteins related to cell integrity (FAT1, ITGA7, PLEC), were more enriched in MS-CI tissues. Finally, we quantified the changes in abundance of the niches and associated cell types in the ST data (Fig. 3j and Supplementary Tables 8 and 9; Methods). Although niche abundances did not change between MS lesion types, AS were enriched in MS-CI, reflecting AS-mediated build-up of the glial scar in the chronic LC. Moreover, MC cells were enriched in the LR of MS-CA, suggesting sustained inflammation at the rim.

Taken together, these results help better characterize different lesion types by offering high-resolution spatial insights into cell types and gene expression changes across different niches.

Inference of differential cell–cell communication events

After quantifying the transcriptomic differences between tissue niches in MS lesions, we explored whether those genes were associated with cell–cell communication (CCC) events through ligand–receptor interactions. CCC inference from single-nuclei resolution is challenging and prone to generate false positive results42. To improve accuracy, we devised a computational approach using our paired snRNA-seq data, leveraging spatial information to increase the probability of CCC events, potentially reducing prediction errors (Methods). We computed differential ligand–receptor interactions across cell type pairs for each condition. For the significant interactions, we calculated spatially weighted local scores per spot in the ST data. Finally, only significant interactions where the ligand, the receptor or both were marker genes of a specific cell subtype that was significantly abundant in the same lesion type were kept for analysis (Fig. 4a).

Fig. 4: Inference of differential CCC events.
figure 4

a, Pipeline design for the inference and filtering of CCC events. First (left), DGE analysis results obtained from snRNA-seq were used to obtain significant cell type pair ligand–receptor interactions (BH-adjusted P < 0.15). Color scale indicates association with condition. Second (center), spatially weighted interaction local scores are computed for the selected interactions and tested for significance across lesion types (BH-adjusted P < 0.15). Violet to yellow scale indicates spatial weight; blue to red scales indicate magnitude of interaction scores. Third (right), interactions are only kept if their ligand or receptor was a marker gene for a MS lesion type and specific cell subtype. At each step, conflicting interactions were dropped to ensure robustness. GEX, gene expression. b, Cumulative number of differential interactions grouped by cell type across conditions. Color indicates condition. c, Venn diagram of overlapping interactions between MS lesion types and CTRL. d, Cell–cell network structure. Edge size indicates number of interactions. Green indicates healthy (CTRL) and purple disease (MS-CA and MS-CI) specificity. e, Top 30 significant interactions grouped by gene across conditions. f, Top 30 significant interactions per MS lesion types and CTRL across niches. Text color indicates sender (left) and receiver (right) cell types. g,h, Box plots of AS-encoded ligand HMGB1 and MC-encoded receptors CD163 (g) or TLR2 (h) between conditions (top left and center). Note box plot showing cell–cell interaction scores between ligand and receptor together with predicted ST mapping (top right). smFISH for ADCY2 (AS), CD163/TLR2 (MC) and HMGB1 (ligand). i,j, Box plots of MC-encoded ligand CD14 and both the EC-encoded (i) or AS-encoded (j) receptor ITGB1 (top left and center). Note box plot showing interaction scores between MC and EC/AS cells with predicted ST mapping. smFISH for CD14 (MC), ITGB1 (EC/AS) and VWF (EC) or ADCY2 (AS). Two-tailed Wald test for gene expression; BH-adjusted P < 0.05; n = 6 for CTRL, n = 6 for MS-CA, n = 4 for MS-CI; two-tailed Wilcoxon rank-sum test for interaction scores; BH-adjusted P < 0.10; n = 6 for CTRL, n = 8 for MS-CA, n = 4 for MS-CI. Scale bars, 20 µm. Box plots illustrate the median (white dot), 25th percentile (Q1) and 75th percentile (Q3), with the ends of the black box indicating Q1 and Q3; the whiskers extend to the furthest datapoint within 1.5× the IQR, specifically 1.5 × (Q3–Q1), beyond Q3 or below Q1.

Using our approach, 190 differential CCC events out of 2,724 were consistent across profiling technologies (Extended Data Fig. 10a and Supplementary Table 8). When grouped by cell type, AS, MC and OL had the highest number of differential interactions (Fig. 4b). Most AS and MC interactions were associated with MS-CA, while OL were linked mainly to CTRL. No interactions overlapped between CTRL and MS tissues, and few were shared between MS lesion types, suggesting condition-specific CCC events (Fig. 4c). After grouping interactions between CTRL and MS, we observed that cell-type-specific crosstalk was distinct between them (Fig. 4d). In CTRL, we found a quite heterogeneous interaction network formed by OL, OPC, AS, MC and EC, while in MS, networks were driven by immune cell types that seemed to interact with AS, SC and EC (Extended Data Fig. 10b–d). Next, when we grouped interactions at the gene level, we observed that most ligands or receptors were specific to a condition (CTRL, MS-CA or MS-CI) (Fig. 4e and Extended Data Fig. 10e). Furthermore, the top identified interactions exhibited tissue niche specificity: CTRL interactions mapped to WM, MS-CA to LR and MS-CI to both LR and VI niches (Fig. 4f).

Given the importance of the inflamed rim in MS, we focused on the LR niche and validated five MS-CA-specific interactions (Methods), one of which is the interaction between HMGB1 and CD163/TLR2. HMGB1 is a secreted DAMP protein that regulates inflammation-related pathways depending on the receptor (for example, CD163 (ref. 43) or TLR2 (ref. 44)). In our data, we predicted CCC events mediated by HMGB1 encoded by AS, NEU, EC and OL cells, binding to CD163 and TLR2 encoded by MC subtypes (Fig. 4f and Extended data Fig. 10e). In MS-CA, these interactions were most prominent between AS and MC, and mapped to the inflamed LR niche (Fig. 4g,h). Using smFISH, we validated those CCC events by observing ADCY2/HMGB1-expressing AS in close proximity to CD163- or TLR2-expressing MC subtypes, specific to LR regions in MS-CA lesions. (Extended data Fig. 10f).

We then focused on the CD14–ITGB1 ligand–receptor pair identified through CCC gene expression analysis. CD14 encodes a well-characterized MC receptor that mediates innate immune responses. It can also act as a surface ligand, although its functions are less extensively documented45. ITGB1, also known as CD29, encodes an integrin that binds to CD14 and was found to be upregulated in EC, AS and MC subtypes (Fig. 4i,j and Extended data Fig. 10g). CD14–ITGB1 interactions between MC, AS and EC subtypes were predicted to map to LR niches in MS-CA samples. Validation with smFISH revealed ADCY2/ITGB1-expressing AS, VWF/ITGB1-expressing EC and CD14/ITGB1-expressing MC cells to be in close proximity to CD14-expressing MC cells at the rims of MS lesions, but absent in CTRL samples (Extended data Fig. 10g,h).

Finally, we examined the pathways associated with the validated CCC networks (HMGB1-CD163/TLR2 and CD14/ITGB1). Across all cell–cell specific interactions, we found broad enrichment of pathways that aligned with classic MC functions such as lipid degradation during phagocytosis and complement factor production (Extended data Fig. 10i; Methods).

In summary, unsupervised analysis and validation of CCC events in spatially restricted MS niches can help identify disease-relevant targets.

Discussion

Previous tools for assessing MS pathology have focused on conventional histopathological techniques10,11,24, which are limited by the number of antibodies to distinguish cell types. Recent advancements in transcriptomics and proteomics1,5,23,46,47,48 have enhanced our understanding of cell-type-specific gene expression and dysregulation, although with limitations. Combining unsupervised single-cell computational tools with multiplex imaging techniques would offer a holistic approach to deep tissue profiling in MS pathology. To achieve this, we generated a paired snRNA-seq and ST dataset, enabling the study of tissue niches, cytoarchitecture and CCC. This combined approach permitted us to generate a high-resolution transcriptomic atlas of subcortical WM in CTRL and MS lesions. In this study, we focused on the composition of the main cell types and their subtypes, highlighting tissue changes between inflamed CA and noninflamed CI lesions, especially at the LR.

Using gene expression data from the main cell types, we estimated the composition of each ST tissue spot, deriving unsupervised spatial niche annotations that substantially overlapped with histopathologically annotated lesion and nonlesion areas. Through an unsupervised multiview factor analysis, we identified four distinct niches unique to MS: VI, LC, LR and PPWM. The VI niche, located primarily within the LC, contained EC, SC and immune cells, resembling lesion-associated perivascular spaces. Of note, the expansion of VI spaces in MS is compatible with aspects of chronic perivascular immune cell cuffing and tissue infiltration, indicating chronic blood–brain impairment and smoldering tissue inflammation4,49,50. Sustained blood–brain barrier leakage and fibrinogen deposition are known pathological features of MS and experimental autoimmune encephalomyelitis51,52, which can trigger perivascular MC activation52. Therefore, our analysis highlights perivascular spaces and their cellular components as a key tissue niche in MS.

Further, our approach enabled us to distinguish CTRL tissue from both MS-CA and MS-CI lesion types at the transcriptomic level. We identified differences between early MS-CA, characterized by an inflamed LR, and late MS-CI lesions, which enabled us to track gene expression changes during lesion progression. For example, within the OL lineage, we observed an appearance of genes linked to peripheral myelin synthesis, such as MPZ and NGFR during lesion development. Evidence for peripheral myelin synthesis and presence of Schwann cells have been reported in human MS53 and experimental demyelinating lesions53,54, suggesting potential roles of peripheral myelin in WM repair55. These results demonstrate that unsupervised computational approaches can reveal biologically relevant targets previously identified in MS and related neuroinflammatory models.

Focusing on AS, a key finding was the identification of a long cilia-forming subtype mapped to the core of chronic MS lesions, suggesting a complex level of astroglial diversity associated with the glial scar. These cilia were longer than those in EP cells, although their functional implications remain unclear. AS with abnormal motile cilia gene expression have been reported previously in the mouse cortex after inducing mitochondrial dysfunction via Twnk loss-of-function56. It is known that mitochondrial dysfunction and metabolic exhaustion are known features of chronic MS lesions and have also been reported for reactive AS in MS57,58. Further, recent reports investigating glial changes in spinal cord injury and stroke models in mice59 and rats60 also demonstrated the presence of scar-associated AS subtype cells with cilia-associated genes. We validated those cilia-associated AS genes at both RNA and protein level and could confirm the presence of elongated cilia as well as AS cell type specificity. Although the in vivo function of ciliated AS remains elusive, we hypothesize that their location and lack of a proinflammatory phenotype suggest a role in tissue remodeling and chronic scar formation61.

In the last part of the study, we focused on CCC and developed a computational method to analyze ligand–receptor interactions between CTRL tissue and MS lesions. Our approach considered only interactions with significant changes across conditions in both data modalities, ensuring more robust predictions than using only one. For example, we identified interactions that had been characterized previously, such as between AS-derived HMGB1 and MC receptors CD163/TLR2. Of note, CD163—an iron scavenger receptor—is upregulated in MC subtypes at inflamed MS-CA rims22. In addition, MC-encoded CD163 may also mediate sensing of cell stress and damage-associated molecules such as HMGB1 (ref. 62). We also identified interactions not previously characterized in MS, such as the one between the MC ligand CD14 and the MC/AS/EC-derived integrin ITGB1, which also mapped to the inflamed MS-CA rim. Previous studies reported critical roles of ITGB1 in chemotaxis of blood derived MC45, blood vessel-guided migration of AS progenitors63 and in the stability64 and disruption of blood vessel integrity through EC subtypes65. Thus, this glial–immune–vascular interaction might have specific roles in blood vessel remodeling and immune cell trafficking at inflamed MS LRs. These findings, along with all other predicted CCC events, illustrate how computational methods can be used to identify disease-specific interactions and link them to specific cell subtypes and tissue niches in MS.

In summary, we generated and analyzed paired snRNA-seq and ST data to reveal the complex tissue microenvironment underlying MS lesion progression, focusing on the rim area and identifying cell-type-specific and spatially restricted drivers of pathology. While future research should examine different lesion types in individual patients, our results enhance understanding of the molecular cytoarchitecture of the subcortical WM across MS lesion types. As our approach is highly driven by computational prediction models in combination with in situ validation on RNA and protein level, future work will be necessary to functionally validate the findings obtained.

Methods

Postmortem human tissue samples

In total, we examined 15 snap-frozen tissue blocks, obtained from autopsies from six people diagnosed with MS and seven tissue donors without recognizable neuropathological changes (CTRL). All tissue samples used in this work were provided by the UK Multiple Sclerosis Tissue Bank at Imperial College London, after ethical approval by the National Research Ethics Committee in the UK (08/MRE09/31). Further information about the donors is provided in Supplementary Table 1. No statistical methods were used to predetermine sample size but our sample size is similar to those reported in previous publications, for example Absinta et al.5 and Schirmer et al.23. Data collection and analysis were not performed blind to the conditions of the experiments.

IHC and histochemistry

Snap-frozen tissue blocks were dissected into 16-µm-thick sections using a Leica Microsystems CM3050S cryostat, placed on VWR superfrost plus microscope slides and stored at −80 °C. Histopathological assessment was performed using IHC for CD68 and CD163 as described previously23. The following antibodies were used: mouse anti-CD68 (cat. no. 333802, 1:200, Biolegend), and mouse anti-CD163 (cat. no. NCL-L-CD163, 1:1,000, Novocastra).

For chromogenic CD163 IHC and histochemistry, tissue sections were fixed on slides by thawing and drying, followed by immersion in acetone for 10 min at 4 °C. Afterwards, slides were allowed to dry at room temperature. CD163 was stained with an Autostainer Link 48 by Dako. Endogenous peroxidase was blocked using a ready-to-use peroxidase-blocking solution (cat. no. S202386-2, Dako). The primary antibody was diluted in antibody diluent (cat. no. S080983-2, Dako) and applied for 60 min at room temperature. After a washing step, slides were exposed to a mouse-specific biotinylated secondary antibody (cat. no. GV82111-2, Dako) for 15 min, followed by incubation with streptavidin-linked horseradish peroxidase (cat. no. SM802, Dako) for 20 min. The staining was developed using a 1:50 dilution of 3,3′-diaminobenzidine (DAB) chromogen in DAB+ substrate buffer (cat. no. GV825, Dako) for 10 min. Sections were counterstained using hematoxylin and eosin (H&E) and coverslipped.

Sections were stained for myelin with LFB by incubation with 0.1% LFB at 56 °C overnight. After washing with 96% ethanol and rehydration, the slides were immersed in 0.1% aqueous lithium-carbonate solution for 5 min. The staining was differentiated in 70% ethanol until the myelin sheaths obtained an intense blue color. Subsequently, tissue slides were washed with distilled water, counterstained with periodic acid-Schiff and, following dehydration, coverslipped with a mounting medium (cat. no. 03989, Merck).

To localize ferrous and ferric iron, DAB-enhanced Turnbull Blue (TBB) was applied as described previously66. In short, the slides were dried and exposed to 10% ammonium sulfide solution (cat. no. 105442, Merck) in double-distilled water for 90 min. Consecutively, slides were immersed in 10% potassium ferricyanide and 0.5% hydrogen chloride in an aqueous solution for 15 min. This step was followed by blocking the endogenous peroxidase with 0.01 M sodium azide and 0.3% hydrogen peroxide in methanol for 60 min. The slides were washed with 0.1 M Sorensen’s phosphate buffer and the staining was developed with a 1:50 solution of DAB chromogen (cat. no. K3468, Dako) and 0.005% hydrogen peroxide in Sorensen’s phosphate buffer for 20 min. Slides were counterstained with H&E and coverslipped.

For IF staining, sections were fixed in ice-cold methanol, 4% paraformaldehyde or acetone at room temperature for 10 min, then blocked with PBS-T (PBS 1×, Triton 0,1%) and goat serum (cat. no. 16210-064, Gibco, 1:10) for 30 min. Primary antibodies were diluted in PBS-T or Intercept-TBS (cat. no. 927-60001, LI-COR) and then incubated overnight at 4 °C. The slides were washed twice with PBS for 5 min and then incubated with secondary antibodies diluted in PBS-T for 2 h. Slides were then washed twice with PBS for 5 min and mounted using Fluoromount-G with 4,6-diamidino-2-phenylindole (DAPI) (cat. no. 00-4959-52, ThermoFisher Scientific). The following primary antibodies were used: rabbit anti-SPAG17 (cat. no. PA5- 55912, ThermoFisher, 1:100), mouse anti-SMI32 (cat. no. 801701, Biolegend, 1:2,500), rat anti-GFAP (clone 2.2B10, cat. no. 13-0300, ThermoFisher, 1:200), rabbit anti-CD3 (clone CD3-12, cat. no. MCA1477, Bio-Rad, 1:200), rabbit anti-CD11B (clone EPR1344, cat. no. ab133335, Abcam, 1:500) and mouse anti-CD19 (clone HIB19, cat. no. 302202, Biolegend, 1:50). Separate slides were stained only with secondary antibody, to be able to identify background signals.

Fluorescence multiplex in situ RNA hybridization

For smFISH validation, we processed cryosections of 14 µm thickness. smFISH was performed on a representative selection of CTRL and MS tissue samples using ACD RNAscope 2.5 HD Red and Multiplex Fluorescent v.2 assays (ACD Biotecne), following the protocol of a previous publication1. The following human RNAscope assay probes were used: HMGB1 (C1), SPAG17 (C1), CD14 (C1), CD163 (C2), TLR2 (C2), ITGB1 (C2), ADCY2 (C3) and VWF (C3). For multiplex ISH, probes were labeled with TSA Vivid Fluorophores (Fluorescein, Cyanine 3, Cyanine 5, Akoya Biosciences). Slides with positive ISH and human ACD bio-techne 3-plex negative probes, were used in every run as quality control.

Image acquisition and quantification

Brightfield images of CD163, CD68, TBB (iron) and LFB were acquired using a Leica DM6 B microscope with Leica K3C camera at ×20 magnification. Pictures were imaged with the Leica Application Suite X (LAS X) software (v.3.8.1.26810) and exported as TIFF files and later processed using ImageJ (v.2-2.14.0) software. Images were also acquired using Hamamatsu NanoZoomer 2.0HT at ×40 magnification and exported as NPD files. Image processing of histological data was performed using GIMP-v.2.10 software.

Multiplexed fluorescent images were taken using a Leica DM6 B microscope with a Leica K5 camera, and preprocessed with a thunder imaging system. Focus points were set at ×20 magnification for overview and quantification purposes. All fluorescent pictures are z-stack images consisting of 10–20 layers with a 0.7-μm step size, in all four channels. Pictures were imaged with the LAS X software, exported as LIF files and analyzed with QUPATH (v.0.4.3)67. First, cells were detected on the DAPI channel. Cell expansion was set at 2 µm, and subcellular spot detection was run for ADCY2 and SPAG17. Double positive cells were classified using a composite classifier on the estimated subcellular spots of ADCY2 and SPAG17. Quantification was performed for LC, LR and PPWM lesion areas.

Histopathological assessment and lesion type characterization

We focused on subcortical WM specimens for analysis; however, in certain cases, small adjacent GM areas have been captured during tissue sectioning. WM lesions were identified as areas with a marked loss of myelin staining visualized through LFB-periodic acid-Schiff histochemistry. We further classified lesions by IHC using antibodies against CD68, CD163 to detect activated MC and TBB to detect iron at LRs. Lesions were then classified into MS-CA or MS-CI according to established criteria12. Notably, fully remyelinated lesions, so-called shadow plaques, were not included. Staging of lesion types was performed by a trained neuropathologist specializing in MS pathology. Chronic active lesions showed a hypocellular, demyelinated lesion center but a distinct inflamed rim with presence of CD68- and CD163-positive cells, regularly containing LFB-positive myelin degradation products. Some of these lesions showed accumulation of iron-laden microglia and macrophages at the rim. Chronic inactive lesions were characterized by a fully demyelinated, hypocellular lesion center, a low frequency of CD68- and CD163-positive macrophages or microglia within lesions and a distinct rim without accumulation of CD68- or CD163-positive cells.

Sample selection for transcriptomics

The RNA integration number (RIN) was used as a sample selection criteria, and only samples with a value of ≥5.9 were included for both transcriptomic analysis. We cut 70-µm-thick sections of tissue on a Leica Microsystems CM3050S cryostat to obtain a final weight of 15 mg of tissue, from which the RNA was isolated. This was done using TRIzol (cat. no. 15596026, ThermoFisher), chloroform (cat. no. 1731042, Sigma Aldrich) and RNeasy mini kit (cat. no. 74104, QIAGEN) following the manufacturer’s recommendations. RNA integrity was measured on an Agilent 2100 Bioanalyzer using the High Sensitivity RNA ScreenTape (cat. no. 5067-5579, Agilent), buffer (cat. no. 5067-5580, Agilent) and ladder (cat. no. 5067-5581, Agilent) according to the manufacturer’s instructions.

Nuclei isolation, library preparation and ambient mRNA correction

Nuclei from selected samples were isolated using sucrose-gradient ultracentrifugation according to established workflows23. Following isolation, nuclei were diluted to a final concentration of 1,000 nuclei per microliter and loaded to the 10x Genomics Chromium controller aiming for a recovery rate of 8,000 nuclei per sample. We prepared the libraries following the 10x Genomics protocol using the 3′ single cell v.3.1 kit (cat. no. PN 1000121) with single indexing. Samples were sequenced on a NovaSeq 6000 aiming for a sequencing depth of 30,000 reads per nucleus. Expression count matrices for each sample were generated using Cell Ranger Count (v.6.0.2) by performing alignment to the sequencing data against the GRCh38-2020-A reference transcriptome. Then, the obtained count matrices were corrected for ambient mRNA using CellBender (v.0.2.0)68 with the following parameters: model = full; expected_cells = 8,000; total_droplets_included = 50,000; fpr = 0.01; epochs = 150; posterior_batch_size = 5; cells_posterior_reg_calc = 50.

snRNA-seq data quality control

The data processing and downstream analyses for the snRNA-seq datasets was done using the scanpy toolkit (v.1.9.6)69 in Python (v.3.10.0). Each sample was filtered separately to control for batch differences. Single nuclei were filtered by genes (<200 genes), mitochondrial genes (percentage of mitochondrial genes <5%) and gene counts (number counts < number counts 99th percentile). Genes were kept if they were expressed across different nuclei (number of expressed nuclei >3). Afterwards, nuclei were filtered by doublet scores computed with scrublet70 (doublet score <0.1). Finally, each nuclei raw expression was normalized by the median of total counts (target_sum = None) and log-transformed (log1p).

snRNA-seq data integration and cell annotation

A single AnnData object was generated by concatenating (join=outer) all preprocessed nuclei coming from different samples. Feature selection was performed by computing high variable genes per sample and then selecting the top 4,096 genes that were flagged as variable in the maximum number of samples. Genes were then scaled across nuclei and principal component analysis (PCA) was calculated on the selected features. Harmony-py (v.0.0.9)71 was used to integrate the obtained PCs, eliminating batch effects between samples. Nearest neighbors were generated for nuclei by estimating similarities in the PC space (n_pcs = 50). The connectivities obtained were used to generate a uniform manifold approximation and projection (UMAP) manifold. Nuclei were clustered using the Leiden graph-clustering method72 (resolution = 0.25) and annotated manually using brain gene markers5,23 (Supplementary Table 3).

Comparison with an independent atlas

To validate cell type annotation, we compared our generated atlas with another reference human snRNA-seq atlas at the molecular level5. The count matrix and annotation metadata were downloaded from GSE180759. Pseudobulk transcriptomic profiles were generated for both atlases at the cell type level using decoupler-py (v.1.5.1, commit b6b430e)73. Then, genes were filtered based on the following hyperparameters: min_count = 10, min_total_count = 15, min_prop = 0.2, min_smpls = 2. To make them comparable, we filtered the profiles by the intersection of genes between the two atlases and log-normalized them with scanpy (target_sum = None). Finally, the Pearson correlation between the different profiles was performed and the P value adjusted by Benjamini–Hochberg (BH) correction (BH-adjusted P < 0.05; r > 0.75).

Identification and characterization of cell subtypes

To identify cell subtypes of principal cell lineages, the main cell types were subsetted from the entire snRNA-seq atlas. Genes with enough expression across cell types were kept (number of expressed nuclei >3), and samples with not enough cells for a particular cell type were removed (number of cells ≤5). PCA was then performed on the scaled log-transformed most variable genes across as many samples as possible. Depending on the number of cells, different numbers of variable genes were used: 4,096 genes (number of cells >1 × 104), 2,048 genes (number of cells >1 × 103) and 1,024 genes (number of cells <1 × 103). PCs were integrated using harmony-py71, nearest neighbors were computed per nucleus by finding similarities in the corrected PC space (n_pcs = 50) and the obtained connectivities were used to generate a UMAP manifold and clustering with the Leiden algorithm (resolution = 1.0). Clusters were characterized by identifying marker genes using the rank_genes_groups (method = t-test_overstim_var) function from scanpy with the log-transformed counts (adjusted P < 0.05; abs(log2FC) > 0.5). Finally, clusters were manually annotated based on marker genes (Supplementary Table 4). Clusters with no clear molecular profiles were removed (denoted as NA).

ST workflow

The 10x Genomics Visium Spatial Gene Expression platform was used for the ST experiments. Tissue samples (RIN ≥ 5.9) were cut into 10 µm sections using a Leica CM3050 S cryostat and placed into a Spatial Gene Expression Slides (cat. no. PN-1000185, region of interest (ROI) 6.5 × 6.5 mm) that was precooled inside the cryostat at −22 °C. The slides were stored in a container at −80 °C until further processing. The sections were then fixed and stained using protocol CG000160 Rev B. The sections were then imaged for a general morphological analysis and for future spatial alignment of the data using a ×10 lens equipped to a Leica DMi8 microscope and processed by LAS X.

Next, slides were permeabilized enzymatically for 18 min. This time was assessed using the 10x Visium Tissue Optimization kit (PN-1000191) and following the protocol CG000238 Rev D. The generation of the libraries was performed according to published protocols (10x Genomics): CG000239 Rev D, using the Gene Expression Reagent kit (cat. no. PN-1000186), the Library Construction Kit (cat. no. PN-1000190) and the Dual Index Plate TT Set A (cat. no. PN-1000215). To assess the correct amplification of obtained cDNA, QuantStudio 3 from ThermoFisher was used. For full length of cDNA and indexed libraries analysis, the TapeStation 4200 analyzer (Agilent) was used. The libraries were loaded at 300 pM and sequenced on a NovaSeq 6000 system (Illumina) with a sequencing depth of 250 million reads per sample.

The data were subjected to demultiplexing using SpaceRanger software (10x, v.2.0.0), creating FASTQ files. These files were then used by SpaceRanger count to perform alignment with the human reference genome GRCh38-2020-A, tissue detection, fiducial detection and barcode/unique molecular identifier counting generating a spatial gene count matrix per slide.

ST data annotation and quality control

First, spatial coordinates were annotated manually by a trained neuropathologist based on myelin (MOG, MBP and PLP1), myeloid cell (CD68) and astrocyte (AQP4, and GFAP) marker staining into different area types using the Loupe Browser v.6.3.0 software (10x Genomics). For each slide, spots were filtered by genes (<200 genes) and genes were kept if they were expressed across different spots (number of expressed spots >3) using scanpy. Finally, raw expression of each spot was normalized by the median of total counts (target_sum = none) and log-transformed.

The lesion area was subdivided into distinct zones. The LC or core covers the demyelinated area surrounded by the LR, which is the distinct border between the demyelinated lesion center and the surrounding myelinated WM, called PPWM. Annotations and delineations of lesion areas were carried out according to a previous publication22.

ST cell type deconvolution

The cell2location (v.0.1.3)74 package was used to calculate cell type abundances for each spot. Based on the annotated snRNA-seq atlas, reference expression signatures of main cell types were inferred leveraging regularized negative binomial regressions. Then, each slide was deconvoluted using hierarchical Bayesian models with the following hyperparameters: N_cells_per_location = 5 and detection_alpha = 20. Afterwards, cell type proportions were calculated per spot by dividing the abundance of a given cell type by the total sum of abundances of a given spot. To check how well the deconvolution worked, we compared the estimated cell type proportions from each slide with the actual cell type proportions in its matching snRNA-seq data. We did this by calculating the Pearson correlation at both the sample and cell type levels. Additionally, spatial activity scores were computed using the enrichment method ULM from decoupler with the REACTOME73,75 pathway genesets (bandwidth = 150).

Generation, characterization and validation of niches in ST data

To annotate niches in an unsupervised way across spots, gene expression and cell type compositions were first integrated into a multiview factor model using MOFA+ (v.0.7.0)26 for each slide independently. Specifically, cell type proportions were transformed to centered log ratios76 using the clr function of the composition-stats Python package (v.2.0.0), while gene expression was summarized into 50 principal components by computing PCA on the scaled top 4,096 variable genes. These two matrices were treated as different views in the MOFA+ model to generate 15 latent factors with the following parameters: scale_views = False, center_groups = False, spikeslab_weights = False, ard_weights = False, ard_factors = False. Then, nearest neighbors were generated per spot by estimating similarities in the latent factor space. The obtained connectivities were used to generate a UMAP manifold and to cluster spots using the Leiden algorithm (resolution = 1.0). Clusters were then annotated manually into niches based on cell type presence and pathway activities of the hallmark genesets77 generated with the enrichment method ULM from decoupler73.

To identify marker genes of niches, slides were concatenated into a single AnnData object and the scanpy’s function rank_genes_groups (method = t-test_overstim_var) was used. Additionally, to identify characteristic cell types per niche, cell type proportions were averaged per slide and then tested for differences against the rest (Wilcoxon rank-sum test, adjusted P < 0.05).

To assess the level of overlap between the computationally annotated niches and the ones annotated by a pathologist, the Jaccard index and the ARI were computed for each slide, ignoring categories that were not shared between annotations. Additionally, similarities between intra and interniche spots were computed using the Pearson correlation at the pseudobulked gene expression and mean clr-transformed cell type proportions.

Neighborhood analysis of spatial niches was performed by computing local spatial proportions. For each slide, a binary matrix (spots as rows and niches as features) was generated based on the obtained niche annotations. Niche binary values per spot were weighted spatially by averaging neighboring spots using an L1-norm Gaussian kernel (bandwidth = 150), obtaining spatial proportions. Values were then averaged per niche across slides.

Spatial trajectory analysis across niches

Slides were concatenated and pseudobulk profiles were generated for each niche and slide combination using decoupler73. Genes were kept following hyperparameters: group = None, min_count > 10, min_total_count > 15. The resulting gene profiles were log-transformed and normalized (target_sum = 1 × 104). The Spearman correlation was then computed for each gene based on the following niche order: WM, PPWM, LR, LC, VI. The obtained gene correlation statistics were then used to infer pathway activities that change differentially across the trajectory from the PROGENy78 resource using the ULM method. Significant pathways (P < 0.05) were then also computed for the pseudobulked gene expression profiles. Finally, the correlated genes, separately for positive and negative, were tested for enrichment for genesets from the REACTOME75 collection using the method ORA from decoupler73. Genesets containing the words FETAL, INFECTION or SARS were removed before computing the enrichment.

Characterization of ciliated AS

Samples and slides containing Ependym, MS11 and MS12 were removed from the analysis due to the similarity of EP cells to ciliated AS. A ciliated AS geneset was obtained by filtering the obtained marker genes with scanpy from the filtered atlas (method = t-test_overestim_var, adjusted P < 1 × 10−4, log2FC > 1). Spots containing AS (proportion > 0.25) were predicted to contain ciliated AS by computing enrichment scores using the obtained geneset with the ULM method from decoupler (weight = None, P < 0.05, score > 0)73. Due to differences in number of spots per slide, a bootstrap strategy was employed to test in which niches ciliated AS are found. For each niche, the total number of spots predicted to contain ciliated AS was used as the estimate. Then, 1,000 random permutations were performed, recording whenever the sampled ciliated AS were higher than the original estimate, to obtain an empirical P value per niche (adjusted P < 0.05). For enrichment analysis, the ciliated AS geneset was tested for enrichment against the REACTOME collection using the ORA method from decoupler (adjusted P < 0.05)73. Genesets containing the words FETAL, INFECTION or SARS were removed before computing the enrichment. Transcription factor activity was computed at both the nuclei and spot level using the CollecTRI79 resource with the method ULM from decoupler73.

Multidimensional scaling of samples

Multidimensional scaling (MDS) implemented in the sklearn (v.1.4.0) Python package was used to capture and visualize sample differences at different levels: cell type proportions, cell subtype proportions, deconvoluted cell type proportions, cell type gene expression and tissue niche gene expression. Cell type, cell subtype and deconvoluted cell type proportions were log-ratio transformed using the clr function of the composition-stats Python package. For gene expression, latent factors capturing shared gene programs between cell types or niches were inferred using the package MOFAcellulaR80 for snRNA-seq and ST samples, respectively.

For snRNA-seq data, we inferred four multicellular factors using multicellular factor analysis80 with MOFA+26 on the pseudobulk expression profiles of each cell type and sample. Each pseudobulk profile was built by summing the gene counts of all cells belonging to a cell type and a sample. Pseudobulk profiles built from at least ten cells were kept in the analysis. Within each cell type, genes expressed in at least 25% of the samples were kept for the analysis. A gene was considered to be expressed in a sample if at least 100 counts were identified. Samples within a cell type with a gene coverage of <90% were excluded. Cell types with fewer than ten samples or greater than 50 genes were not included in the analysis.

For ST data, we inferred three factors using multicellular factor analysis on the pseudobulk expression profiles of each niche and disease sample. Pseudobulk profiles were filtered as described above, except that niches with fewer than nine samples were excluded.

Distances were then computed between samples for each different level by computing the following term: distance = 1 − corr(x, y), where corr is the Pearson correlation coefficient between the vector values of sample x with the vector values of sample y, generating a distance matrix between samples of the same level. Each distance matrix was used to generate a level-specific MDS. Distances were summed into a single matrix, keeping only samples with paired snRNA-seq and ST data since unpaired ones have missing values. This cumulative distance matrix was also used to perform joint MDS summarizing the differences across levels. Finally, silhouette coefficient as implemented in sklearn was computed from each distance matrix at the sample level to quantify the clustering of lesion types.

Differential expression analysis between lesion types

For snRNA-seq, data were pseudobulked per cell type and sample using the function get_pseudobulk from decoupler (min_cells > 10, min_counts > 1,000)73. For each cell type, low expressed genes were removed using the function filter_by_expr from decoupler (group = Lesion type, min_count > 10, min_total_count > 15). Genes were tested to be differentially expressed using PyDESeq2 (v.0.3.5). The covariates lesion type and biological sex were used as design factors, and the contrast was performed between different pairwise lesion types: MS-CA versus CTRL, MS-CI versus CTRL and MS-CA versus MS-CI (cooks_filter = False, independent_filter = False). Contrasts were performed only if enough replicates were available for a particular cell type (min samples > 2).

For ST data, MS-CA and MS-CI typed MS slides were pseudobulked per niche and slide, low expressed genes were filtered, and genes were tested for differential expression per niche between the two lesion types using the same approach described above.

Enrichment analysis of differential expressed genes

For snRNA-seq data, genes were filtered by significance (adjusted P < 0.05, abs(log2FC) > 1) and assigned to a lesion type based on the sign of the gene-level statistic. Because genes can be assigned to a lesion type from the multiple contrasts generated in the last section, they were made unique, generating a unique list per lesion type. Enrichment analysis was performed for each cell type and contrast using the REACTOME geneset collection with the method ORA from decoupler73. Genesets containing the words FETAL, INFECTION or SARS were removed before computing the enrichment.

For ST data, all available gene-level contrast statistics between MS-CA and MS-CI per niche were used as input for the method ULM from decoupler73 to compute differential activity scores across niches using the same REACTOME geneset collection.

Compositional data analysis

Cell type compositions in snRNA-seq data were computed per sample by summing the number of cells per cell type, dividing by the total number of cells and log-ratio transforming the obtained proportions using the clr function of the composition-stats Python package81. In this calculation, SC and BC were removed to make compositions as comparable as possible since they were missing in most CTRL samples. NEU were also removed due to their presence caused by the nature of the tissue sampled rather than the lesion type. Cell subtype compositions in snRNA-seq data were computed per sample and cell type in the same fashion.

Niche compositions in ST data were computed per sample by summing the number of spots per niche, divided by the total number of spots in the slide and log-ratio transforming the obtained proportions. In this calculation, GM and EP were removed due tissue sampling reasons as described above. Cell type compositions per niche and slide were computed by summing the deconvoluted cell type abundances per cell type across spots of each niche, divided by the total sum of abundances for the niche. The obtained proportions were log-ratio transformed. Cell type compositions per slide were computed by summing all deconvoluted cell type abundances per cell type, divided by the total sum of abundances across the whole slide and log-ratio transforming the obtained proportions. For the two latter calculations, NEU were removed as described above.

The Kruskal–Wallis test was used to test for differences in each compositional type (adjusted P < 0.05). For significant elements, the Wilcoxon rank-sum test was used to test for pairwise differences between lesion types (adjusted P < 0.10).

Differential CCC inference

CCC inference was performed by combining results from both snRNA-seq and ST datasets to reduce as much as possible the number of false positive inferred interactions. Differential ligand–receptor interactions between the three lesion type contrasts across cell type pairs were inferred using the consensus ligand–receptor database from LIANA+ (v.1.0.1)42,82. For each contrast, LIANA+ was used to compute a differential interaction score between a ligand of cell type A and a receptor of cell type B as the mean value between their differential gene statistics, generating lesion type-specific cellA–cellB–ligand–receptor interaction tetramers. Interactions with conflicting signs between ligand–receptor or with both genes not being significant (BH-adjusted P < 0.15) were not considered.

Due to the low number of replicates for BC, TC and SC for differential expression analysis, a separate strategy was employed to infer their communications events. For each contrast, snRNA-seq data was filtered for the lesion type being tested keeping only genes that were expressed in at least 5% of cells, and the method rank aggregate from LIANA+ was used to infer CCC scores. Interactions that belonged to any of the three cell types and were significant were kept (BH-adjusted P < 0.15).

For the remaining significant interactions, spatially informed local scores were inferred across slides using a custom multivariate version of the normalized product method from LIANA+. In this approach, cell type proportions and gene expression were binarized for each slide (proportion > (1/No. of cell types); log-normalized expression >0). Feature values per spot were then spatially weighted by averaging neighboring spots using an L1-norm Gaussian kernel (bandwidth = 150). To make features comparable, they were divided by their maximum value, bounding their values between 0 and 1. The local score for a given cellA–cellB–ligand–receptor interaction was computed for each spot by the following formula:

$${\rm{score}}={{C}}_{{\rm{A}}}\times {{C}}_{{\rm{B}}}\times {L}\times {R}$$

where CA and CB are the normalized spatially weighted cell type proportions of cell type A and B, and L and R are the normalized spatially weighted gene expression values of the ligand and receptor genes. Differences of interaction scores between lesion types were tested using the Wilcoxon rank-sum test. Significant interactions with no conflicting sign with the scores obtained in snRNA-seq were selected (BH-adjusted P < 0.15).

Candidate interactions were further filtered based on the results of cell subtype compositional changes and cell subtype marker gene described in the previous sections. Interactions were kept only if the ligand or the receptor was a marker gene for at least one cell subtype that significantly changes its abundance in the corresponding lesion type (BH-adjusted P < 0.15).

Statistics and reproducibility

For most of the manuscript nonparametric tests were used. In the cases when parametric tests were employed, data distribution was assumed to be normal but this was not formally tested. The experiments were not randomized. All IHC (CD163, CD68) stainings, together with iron and LFB stainings were performed on all samples included in the present study with Fig. 1a (right panel) showing representative images of the three conditions. Figure 2h IF for EP cell characterization could only be performed in samples that presented these cell types (n = 2). Figure 2i,j shows representative images of IF stainings done in LCs of MS-CA samples. Extended Data Fig. 5f shows representative images from IF stainings performed in several VI ROIs of MS samples. Extended Data Fig. 9f,h shows representative images of CCC events in CTRL samples to prove that no interaction is present.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.