Main

With the rise of single-cell and spatial omic methodologies, our capacity to describe cell type composition and developmental trajectories has advanced substantially. Omic atlases continue to expand our understanding of the different factors influencing cell identity and positioning in both embryos and adult organisms1,2,3,4,5,6.

Currently, compositional atlases rely on the ease of detection of nucleic acids and assume that transcriptional profiles are accurate descriptors of cell types and tissue structures5. However, cell identity is not solely determined by gene expression, and organs are structured toward a metabolic division of labor7,8. Therefore, mapping detailed biochemical compositions of tissues and organs can reveal an important axis of cell state heterogeneity and tissue structure.

To understand the biochemical tapestry of cells and tissues, a focus on lipids promises substantial advances. Fundamental cellular processes rely on lipids9,10,11,12, which also play roles in cell communication, contributing to organism physiology regulation7,13,14. Moreover, we recently showed that single-cell lipid configurations drive cell states involved in tissue patterning15. During development, lipids display complex dynamics in vertebrates16,17. Yet, despite their relevance, our understanding of how lipid compositional heterogeneity is spatially organized and established at the organism level is limited.

MSI enables the study of the spatial organization of lipids, as it measures the distribution of compounds at micrometric scales in a parallelized and untargeted manner18,19. Today, MSI acquisitions are the measurement of choice to obtain unsupervised two-dimensional biochemical descriptions of samples. However, only by combining multiple MSI acquisitions can one create complete maps that survey lipids along organs and organisms (three dimensional; 3D) over time (4D) and upon perturbation (Supplementary Fig. 1).

We performed matrix-assisted laser desorption–ionization (MALDI)-MSI of entire vertebrate embryos at different stages of development to study the establishment of metabolically defined anatomical zones (lipid territories) (Fig. 1a,b). We considered embryos from D. rerio (zebrafish) that undergo substantial lipid remodeling during development20,21.

Fig. 1: Study design and peak-calling performance of uMAIA and other algorithms.
figure 1

a, Experimental design. Zebrafish embryos at selected developmental stages were sectioned, and MALDI-MSI was performed for alternating sections. b, Four-dimensional metabolic atlas analysis of regional and developmental trends. c, Illustration of uMAIA modules: (1) adaptive peak calling: image extraction from MSI acquisitions, (2) featurization: identifying ions across acquisitions and (3) normalization: reducing experimental variability in intensity distributions. d, Peak positions (Da) and intensities across three spectra (indicated with colors) for three molecules of comparable m/z (estimated values shown with dashed lines). Px., pixel. e, Scatterplot between mass shifts of two molecules of similar mass. Top right: Pearson’s R indicated. f, Heatmaps of extent and direction of mass shifts in space for molecules in d and two other molecules of similar mass. g, Illustration of the peak-calling approach. Spectra are reduced to a frequency (across pixels) histogram (left). Plots of 1,000 spectra stacked above corresponding histograms for 3 representative cases (right). h, Overview of uMAIA’s adaptive peak-calling algorithm (left). Intervals are iteratively expanded in the frequency histogram, decreasing a threshold (t0, t1, t2) as new intervals are considered. Peak intensity is integrated for each interval across spectra, generating images (right). i, Representative example of bins called using MALDIquant, Mirion and uMAIA over the frequency histogram (top) and corresponding images (bottom). Binning intervals shown by lines (color coded by method). j, Frequency histograms (top) and images (bottom) featuring a case of different peak-calling behavior among MALDIquant, Mirion and uMAIA. k, Sensitivity analysis on real data comparing uMAIA, MALDIquant and binning for Orbitrap and Fourier-transform ion cyclotron resonance (FTICR) MS acquisitions. Venn diagrams depict set sizes of molecules passing three assessment criteria to estimate true positive (TP) counts. Numbers within Venn diagrams indicate quantity of high-confidence TPs satisfying all criteria. FP estimate indicated (gray number within outer circle). l, Left: line plot tracking performance (average mutual information score) of uMAIA, binning and MALDIquant on simulated spectra (n = 50; shaded area depicts s.d.). Right: heatmaps displaying average mutual information score over simulations for different SNRs and peak densities (n = 50).

Studies have attempted to localize different lipid species in zebrafish. However, these analyses had limited spatial resolution or temporal scale, distinguishing just a few regions of the embryo22,23 or focusing only on small developmental windows24. A more detailed description of lipid spatiotemporal dynamics requires analyzing high-spatial resolution MS images of entire organisms over development. However, large MSI datasets are challenging to integrate due to technical variability in sample preparation and instrument performance. This variability causes artifacts, including missing detections at the level of pixels or molecules and nonbiological signal intensity deviations25,26,27, that are common to MSI datasets (Supplementary Fig. 2). Ad hoc measures, including the use of internal standards (ISs), have been devised to mitigate these effects, but they require prior knowledge of sample composition, which is incompatible with untargeted approaches27. For this reason, in previous work, MSI dataset analysis was limited to the application of supervised machine learning approaches or the study of major tissue compartments28,29,30. Here, we devised a computational framework, the unified Mass Imaging Analyzer (uMAIA), to tackle these problems. uMAIA allows the generation of 4D lipid atlases of whole organisms from large collections of raw MSI acquisitions (Fig. 1c).

With uMAIA, we mapped a sizable fraction of the lipidome of zebrafish embryos during development. We studied lipid spatiotemporal changes in the absence of external confounding factors, as zebrafish embryogenesis entails a closed metabolic system in which macromolecular precursors are provided by yolk storage8. We traced the emergence of lipid territories that recapitulate embryo anatomical organization and found unexpected distributions of lipid classes that are relevant for tissue-specific functions. Altogether, our work demonstrates that the vertebrate developmental program involves fine-grained and spatiotemporally regulated remodeling of metabolism. It also highlights that lipid distribution is an accurate and information-rich descriptor of vertebrate anatomy, reinforcing the link between cell identity and lipid metabolism31.

Results

Count-based adaptive peak calling obtains precise mass images

The uMAIA framework addresses limitations of standard MSI data processing. It achieves this by accurately extracting images from spectra (peak calling), combining them in a shared feature space across acquisitions (featurization) and normalizing intensities to minimize experimental fluctuations (normalization) (Fig. 1c).

Image extraction from raw MSI data involves identifying peaks generated by isobaric molecules across spectra (pixels). Instrument-driven fluctuations around peak m/z values (mass shifts) complicate this process. Post-acquisition alignment methods have been developed to minimize these errors32,33. However, calibration only partially corrects the problem32,33,34, as we show in different analyses (Fig. 1d–f and Extended Data Fig. 1a–e).

Thus, to extract images from MSI data, m/z intervals that are assumed to encompass signals from isobaric molecules must be considered35. Commonly used methods retrieve intervals of fixed sizes or for which extensions scale with m/z36,37 but do not account for variability in peak-specific mass shifts, which often causes artifacts. Adaptive approaches that attempt to encompass mass shifts by retrieving intervals from intensity peaks have been proposed as a solution38. However, such approaches bias intervals to consider mainly peaks with high intensities even though lower-intensity peaks contain equally relevant spatial information. Therefore, we reasoned that counting the number of detection events over spectra, disregarding intensity, could better expose peak mass shift distributions (Fig. 1g). Based on this principle, we devised an approach inspired by the watershed algorithm39 in which m/z intervals are initialized at histogram maxima and expanded until other intervals are encountered or reach a background level of event counts (Fig. 1h and Methods). This procedure produces m/z intervals tailored to the mass shifts of individual peaks.

We benchmarked our approach using data from different experiments and instruments against non-adaptive (Mirion37) and adaptive (MALDIquant38) methods. Inspection of images extracted by the different methods suggested that uMAIA’s peak caller effectively captured individual peaks (Fig. 1i and Extended Data Fig. 1f–i) and better resolved neighboring ones (Fig. 1j). uMAIA was more precise than Mirion in distinguishing quasi-isobaric compounds (2.3% versus 42% of images containing aggregated peaks) (Extended Data Fig. 2a). Conversely, MALDIquant generated multiple images that were similarly distributed and showed a checkerboard-like spatial complementarity, suggesting that the signals originated from single ions (Extended Data Fig. 2b). Notably, peaks detected by each method typically preserved signal spatial distributions (Extended Data Fig. 2c). Overall, uMAIA retrieved up to 55% more high-quality images (Methods) than the runner-up method, depending on the MSI technology (Fig. 1k and Extended Data Fig. 2d–f). These findings were confirmed with simulated data where we controlled for different signal-to-noise ratios (SNRs) and peak density (Extended Data Fig. 2g). uMAIA broadly outperformed existing methods, with a significantly greater signal recovery per peak than the next-best method (mutual information score uMAIA = 0.98, binning = 0.8, P value = 2.03−16) (Fig. 1l). Notably, when data were preprocessed with standard mass alignment33, uMAIA retrieved 33% more high-quality images (Extended Data Fig. 2h and Methods), while processing times remained within the same order of magnitude (Extended Data Fig. 2i).

Network flow-based matching creates a unified feature space

To analyze datasets comprising multiple acquisitions, peaks representing isobaric molecules must be identified (or ‘matched’) across different samples to featurize the data. Typically, this task is accomplished by aligning spectra across samples using reference peaks, followed by binning (Fig. 2a). While straightforward for a few samples and molecules, this approach is difficult to scale to tens or hundreds of acquisitions, which increases spurious matches. To obtain coherent matchings in large datasets, we conceived an unsupervised and alignment-independent approach that automatically links peaks and tested it on our dataset, for which we profiled the mass accuracy (Fig. 2a and Extended Data Fig. 3a–c). We consider the possible links between peaks and their mutual positioning to pose a network flow problem that can be efficiently optimized (Methods). This formulation incorporates constraints to avoid solutions representing inconsistent scenarios, including cases in which a molecule has multiple matches originating from one acquisition.

Fig. 2: Assessment of unified feature space by uMAIA’s peak matching and comparison with binning approaches.
figure 2

a, Illustration of the peak-matching problem to place acquisitions into coherent feature spaces. Ground truth, solutions of the standard binning approach and uMAIA. b, Illustration of sagittal sectioning of zebrafish at 72 hpf for MSI. Numbers indicate section position. c, Example of molecules matched across sections with detected m/z values shown (left) and visualization of the corresponding molecule, m/z = 731.45 (right). The color and number of the section refer to the scheme in b. d, Violin plots reporting the distribution of Jaccard distances between isotopolog M + 0 and M + 1 presence across acquisitions after featurization. Distributions are computed across all pairs of sections for 20 molecules with the highest signal intensity. Horizontal lines indicate means. e, Illustration clarifying the definition of feature ambiguity and score calculation. f, Density plots displaying frequency of different peak-matching outcomes for 50 MALDI matrix ions. y axis quantifies the number of sections from the experiment in b where featurization was successful. Max, maximum. g, Heatmaps of MALDI matrix ions (rows) identified over sections (column) for different binning sizes and uMAIA. Color coded according to whether zero, one, two or three peaks were identified within the bin for a given section. Matrix compounds are displayed as in f. h, Bar plots of the distribution of different peak-matching outcomes. x axis quantifies the number of sections in which a compound was identified across the dataset. Bars are colored according to the average ambiguity within the range. Vertical dashed line indicates total number of sections in the dataset. i, As in h, corresponding to uMAIA-retrieved sets. j, Bar plots of different peak-matching outcomes stratified by features: not detected in all sections (less than 20 sections), all sections (exactly 20 sections) or more than 20 sections for each method.

We tested our approach on simulated and real data and benchmarked it against binning procedures (Methods). First, experiments on simulated data highlighted that our method performed best for accuracy, recall and precision and was robust to the number of sections provided (Extended Data Fig. 3c). Next, we evaluated our approach on our zebrafish dataset (Fig. 2b,c). Due to the absence of a ground truth, we considered two cases with reliable expectations. In the first case, we used signals from [M + 0] and [M + 1] naturally occurring isotopologs, which should be detected in the same acquisitions. Selecting the 20 most intense lipid peaks and their isotopologs across the dataset indicated that uMAIA more consistently identified isotopologs in the same set of acquisitions than binning methods (Fig. 2d and Extended Data Fig. 3d). For the second case, we considered the 50 most intense MALDI matrix peaks (α-cyano-4-hydroxycinnamic acid; CHCA) that are present across acquisitions as matrix is applied to all samples. To evaluate the matching here, we introduced an ambiguity score to identify spurious matches by counting multiple detections from the same section that cannot originate from isobaric molecules (Fig. 2e). We quantified the occurrence of missing or multiple (ambiguous) matches using different binning sizes or uMAIA. We found that uMAIA maximizes the number of matched peaks while excluding ambiguous detections (Fig. 2f,g).

We next asked to what extent peaks could be matched across the entire dataset. When binning approaches were applied, the number of matched peaks and ambiguity scores scaled with bin sizes: smaller bins prioritized unambiguous detections at the expense of numbers of matched molecules, while increasing bin sizes retrieved larger groups but had higher ambiguity (Fig. 2h). Our approach restricts groups to be unambiguous and retrieves a greater number of peaks across all acquisitions, corresponding to a 20% increase in detected signals (uMAIA = 1,200, binning = 1,000), than the largest bin size tested (0.01) (Fig. 2h–j). This improvement was maintained even after spectral alignment (Extended Data Fig. 3e–g and Methods).

Thus, uMAIA extracts peaks from multiple MSI acquisitions and featurizes the dataset reliably. Importantly, uMAIA is annotation free, thereby enabling further analyses on larger portions of the data, and efficient: over 50 sections and 20,000 molecules can be featurized in <15 min (Extended Data Fig. 3h).

uMAIA transforms intensities to reduce technical variability

Even when peaks are called and matched, integrating MSI acquisitions remains challenging due to batch effects resulting from experimental variability that deform intensity distributions25. While, in the genomic field, many batch correction methods have been proposed40,41,42,43,44,45, MSI data substantially differ in signal distributions and batch effect properties.

We observed that logged intensity values of MALDI-MSI data exhibited bimodal distributions with low (background) and high (foreground) modes (Extended Data Fig. 4a). Variable magnitudes of intensity distortions were observed across foreground modes in peaks from consecutive sections. This was seen even for matrix peaks that should have similar dynamic ranges across sections (Fig. 3a). We hypothesized that the deformations have factorizable components: one molecule specific and another acquisition specific. To investigate this aspect, we derived an ad hoc approximated empirical estimator of batch effect shifts. We exploited a property of our dataset, which is that true biological intensity should vary smoothly between consecutive sections (Methods). Next, we visualized the matrix structure of batch effect estimates (Fig. 3b). Singular-value decomposition revealed that the matrix of empirically estimated batch effects was low rank: approximately 31% of its total variability could be explained by its rank-one approximation, confirming the hypothesis of a strong factorization of the batch effect (Extended Data Fig. 4b,c). Based on this verified relationship, we devised a probabilistic model to perform a regularized estimate of these factors and remove nonbiological distortion from intensity distributions (Fig. 3c, Extended Data Fig. 4d,e and Methods).

Fig. 3: Characterization of technical variability in MSI data: intensity distortions and uMAIA normalization.
figure 3

a, Intensity distribution densities of two CHCA (matrix) peaks across sections of the 72-hpf zebrafish dataset. b, Far left: heatmap displaying empirical estimates of foreground mode intensity shifts for 50 molecules (‘Batch effect characterization’ in Methods). Heatmaps on the right represent the first-, second- and third-rank approximation of the empirical matrix. Approximations were obtained by the outer product of the group of one, two or three singular vectors, respectively. c, Overview of the key modeling idea of distribution shifts. Signal distribution is approximated by a bimodal distribution (that is, a mixture of two Gaussians) with foreground and background mode. Observed foreground distribution parameters: center and standard deviation are subjected to displacement with an offset, which we consider factorizable in two sources (molecule and acquisition specific: λ and γ, respectively). d, MSI raw and normalized images with different methods (ComBat, scArches, uMAIA) for representative molecules across landmark sections of the 72-hpf dataset. Color bars span between the 1st and 99th quantiles (1st % and 99th %) of intensities across all sections. D, dorsal; A, anterior; L, lateral. e, Intensity distributions of molecules before and after normalization using different methods across sections. f, Low-dimensional representation (uniform manifold approximation and projection, UMAP) of pixels for raw data and outputs after ComBat, scArches and uMAIA normalization. Pixels are color coded by the section from which they originate. g, Spatial visualization of discrete clusters after application of the k-means algorithm on raw data and data processed with ComBat, scArches and uMAIA. Arrows indicate clear residual batch effects after clustering. The 72-hpf zebrafish data were used.

First, we tested our approach on two simulated datasets, applying distortions that mimicked batch effect characteristics from real datasets (Extended Data Fig. 4f and Methods). The first was constructed to facilitate interpretation: overlapping ellipsoids were generated with different average signal intensities. To challenge our model with more complex and realistic spatial distributions, we considered a second simulation based on expression patterns from the Allen Brain Atlas (ABA)46. We simulated batch effect intensity distortions for each section in both datasets, corrected the data with uMAIA, z normalization and ComBat, a popular method for batch effect normalization42, and evaluated how well each algorithm approximated ground truth intensities (Extended Data Fig. 4g–p and Supplementary Figs. 3 and 4).

uMAIA performed well in all cases and was both qualitatively and quantitatively superior to other methods in instances in which low-intensity values risked to be stretched to signal, implying appropriate regularization of the estimated parameters. This was indicated in the first dataset in which uMAIA recovered the ordering of the modes from raw data and reduced their variances, while ComBat and z normalization intermixed the intensities of distinct modes by assuming normally distributed data (Extended Data Fig. 4h–j). The result was also reflected in the ABA-based simulation: root mean square error scores between ground truth and normalized pixel values were lowest for uMAIA-corrected data (mean 89% improvement), followed by ComBat (mean 23% improvement) and worst for z normalization (mean 17% decline) (Extended Data Fig. 4m). Additionally, multivariate analysis qualitatively indicated that uMAIA-corrected data better recapitulated the ground truth (Supplementary Fig. 4a,b).

We next asked to what extent each method enabled biologically meaningful comparative analysis between regions. We selected five major ABA regions that spanned multiple sections and applied a two-sided t-test between pairs of regions using average values within sections (Methods). Comparing these results to ground truth images identified false negative (FN) and false positive (FP) detections (Supplementary Fig. 4c). uMAIA followed ground truth trends more often than other methods, resulting in lower FP and FN rates (FP rate (FPR) = 0.07, FN rate (FNR) = 0.12) than z normalization (FPR = 0.15, FNR = 0.24) and ComBat (FPR = 0.18, FNR = 0.22), approaching the lower bound (FPR = 0.05, FNR = 0.08; Methods and Extended Data Fig. 4o,p).

To assess the ability of the methods to correct intensity distortions in real data, we applied them to our featurized zebrafish dataset. In addition to ComBat, we tested a deep learning method, scArches, that recombines features to integrate data47. uMAIA normalization reduced high-frequency intensity differences that were likely attributable to experimental noise while retaining signal mean levels (Fig. 3d,e and Extended Data Fig. 5a–d). Without a ground truth for this dataset, we evaluated performance using multivariate and clustering analysis jointly applied to the pixels across sections, expecting similar clusters in consecutive sections with successful normalization. We used ComBat, scArches and uMAIA to normalize the dataset and applied principal-component analysis (PCA) followed by k-means clustering42,47 (Extended Data Fig. 5e,f). Pixel clusters after uMAIA normalization were the most consistent across sections (Fig. 3f,g and Extended Data Fig. 5f). In addition, alternative models of the batch effect were less effective at correcting intensity distortions (Extended Data Fig. 6 and Methods). Analysis of a proteomic and metabolite MSI dataset suggested that our method was also effective at reducing intensity distortions captured by different modalities (Supplementary Fig. 5).

Overall, uMAIA enables MSI data integration for whole-organism and organ atlases, helping reveal patterns in lipid distributions.

Lipid territories emerge and diversify over development

Previous studies indicated that lipids undergo substantial remodeling throughout zebrafish development20,21. To clarify the extent of such changes between developmental milestones, we performed bulk lipidomics by maximum-coverage, high-throughput targeted hydrophilic interaction liquid chromatography coupled to electrospray ionization tandem MS (HILIC–ESI-MS/MS or LC–MS/MS)48,49. We selected zebrafish embryos at four developmental stages: gastrula (8 h after fertilization (hpf)), late somitogenesis (24 hpf), pharyngula (48 hpf) and hatching (72 hpf)50 (Fig. 4a). We quantitatively profiled 850 individual lipid species in zebrafish embryos, revealing substantial lipid remodeling during development (Fig. 4b,c, Supplementary Fig. 6 and Supplementary Table 2). Specifically, hexosylceramide levels increased the most, rising more than tenfold from 8 to 72 hpf. Levels of ceramides, sphingomyelins (SMs), phosphatidylserines (PSs) and phosphatidylglycines (PGs) also increased, while those of phosphatidylcholines (PCs) and lyso-PCs (LPCs) significantly decreased (Fig. 4c). Triacylglycerols (TGs), primarily stored in the yolk and serving as energy and building blocks for embryogenesis, showed no collective concentration change (Fig. 4c and Supplementary Fig. 6a–c) but surprisingly increased for species with long fatty acid (FA) chains (≥54 carbon atoms) (Supplementary Fig. 6d,e).

Fig. 4: Four-dimensional metabolic atlas of zebrafish development.
figure 4

a, Overview of developmental stages used for bulk LC–MS/MS and MALDI-MSI acquisitions with selected developmental stages indicated. DG, diacylglycerol; LPE, lysophosphatidylethanolamine; LPI, lysophosphatidylinositol; PI, phosphatidylinositol. b, Low-dimensional representation (UMAP) of bulk LC–MS/MS data. Points represent lipids and are color coded by stage at which concentration peaks (left) or relative changes in concentration over time (right). c, Dot plot indicating overall lipid concentration changes and absolute quantities averaged over classes (left) and specific SM species (right) for each developmental stage. Cers, ceramides; HexCers, hexosylceramides. d, Four-dimensional distributions for selected lipids for analyzed time points (sagittal and dorsal projections, schematic of orientation shown in top row with yolk outlined in yellow). Note that the precise orientation of the embryo at 8 hpf is unknown. Min, minimum. e, Quantification of spatially localized lipids across the sampled time points. Upset plot depicting total molecule counts for each time point (left, n = 1). The time points that are considered within each set are indicated by solid black circles. Stacked bar plot depicting lipid class breakdown for different sets shown in the upset plot (right). f, Stacked bar plots representing the total number (left) and proportions (right) of spatially informative molecules detected at specified time points (n = 1) stratified by lipid class. g, Visualization of the first three principal components (P.C.) over each developmental stage, sagittal (S) and dorsal views. h, Tissue clusters (left) and pseudolineages (right) across sampled time points split generally into anterior and posterior metabolic regions. AR, anterior region; PR, posterior region. i, Correlation matrices between PC species for each developmental stage sorted according to optimal sorting at 72 hpf with SPIN69. Color bar indicates the number of unsaturations for each lipid species. j, Lipid intensity trends for PC species along the anteroposterior (AP) axis for each developmental stage (left). Shaded area indicates standard deviation between biological replicates (n = 2). Varying degrees of unsaturation are indicated by colors. Heatmap of mean-centered average lipid intensity for three species over developmental time (right).

However, bulk measurements cannot reveal tissue-specific compositional changes, thereby overlooking the importance of lipids during tissue patterning. To properly investigate lipid spatial distributions, we built a 4D atlas by collecting MALDI-MSI acquisitions of sagittal sections encompassing whole embryos at each developmental stage and applied uMAIA. We annotated 176 lipids by m/z matching using bulk lipidomics data and reconstructed their 3D distributions (Fig. 4d, Extended Data Fig. 7 and Supplementary Table 1). To increase our confidence in the annotation, we performed a global renormalization of MALDI and LC–MS/MS signals and calculated the correlation of lipids detected between methods (Methods and Extended Data Fig. 7g–i). Lipids that did not correlate were discarded as potential FPs. The images indicated a substantial spatial reorganization over time, with several lipids localizing to specific structures by 48 or 72 hpf (Fig. 4d).

We systematically assessed these changes and found that the number of spatially patterned lipids more than doubled from 26 at 8 hpf to 58 by 24 hpf, increasing to 77 at 48 hpf and 100 at 72 hpf (Fig. 4e,f, Extended Data Fig. 8a and Methods). The initial increase observed at 24 hpf aligns with the end of segmentation when primary organs and somites are established50. Using lipid levels as features, we performed PCA and pixel clustering at each time point. Resultant clusters were spatially coherent within embryos, defining regions that we named ‘lipid territories’. Analysis of lipid territories throughout development revealed an overall increase in lipid distribution complexity (Fig. 4g,h).

Using lipid composition similarity to connect lipid territories across time points, we observed that metabolically related territories occupied equivalent regions at each stage (Fig. 4h). As complexity increases during development, early lipid territories diversified, defining metabolic pseudolineages (Fig. 4h and Methods). Notably, unlike posterior territories, anterior metabolic regions did not appear to diversify, suggesting that their lipid composition is already defined at early stages (Fig. 4h). Supporting this finding, anterior territories at earlier stages harbored more diverse metabolic content than posterior ones (Extended Data Fig. 8b and Methods).

PC lipids underwent substantial spatial reorganization during development. Over time, a two-component structure emerged in PC lipid covariance, intriguingly revealing a partitioning according to unsaturation degree (Fig. 4i). PC species with fewer double bonds localized more anteriorly, mainly in the head, while those with more unsaturations were found posteriorly (Fig. 4j and Extended Data Fig. 8c–e).

Overall, this analysis indicates that lipids finely diversify spatiotemporally during development. Biochemical remodeling and lipid territory emergence occur over time, with increasing complexity present as development proceeds. Importantly, these results, revealed by spatial analysis, were not deducible from bulk lipidomic experiments.

Lipid-defined anatomy of a zebrafish embryo at 72 hpf

The compelling finding that lipid metabolism is spatially organized motivated us to directly investigate the correspondence between anatomical structures and biochemical organization. We focused on the zebrafish at 72 hpf, the most mature time point in our dataset, as it harbored the greatest complexity in tissue lipid composition.

Annotation using LC–MS/MS data identified 142 lipids (122 membrane lipids) in our MSI dataset. This enabled the 3D reconstruction of approximately 27.6% of membrane lipid species, accounting for 73.3% of the overall membrane lipid molar fraction (excluding cholesterol). The reconstructed lipids spanned various classes (in number of species; mol%), including PC (44; 66.64% of imaged lipids), TG (20; 26.71%), PS (16; 0.80%), phosphatidylethanolamine (PE) (15; 1.88%), LPC (12; 1.43%), PG (9; 0.05%), phosphatidylinositol (8; 0.37%), SMs (6; 1.29%), diacylglycerol (5; 0.74%), lysophosphatidylinositol (3; 0.0004%), lysophosphatidylethanolamine (2; 0.003%), ceramides (1; 0.05%) and hexosylceramide (Hex1Cer) (1; 0.04%) (Extended Data Figs. 7d–f and 9).

Remarkably, specific lipids delineated regions including the yolk and yolk extension (TG 52:4), the nervous system (PC 32:0), the hindbrain (PE 38:6) and musculature and internal organs (PC 32:2) (Fig. 5a,b and Supplementary Fig. 7a). This correspondence between lipid distributions and anatomy generalized across the lipidome, as revealed by low-dimensional data representations (PCA and nonnegative matrix factorization)51. The spatial distribution of pixel PCA coordinates indicated strong anatomical covariance of lipid abundance (Fig. 5c). Nonnegative matrix factorization factors further resolved this covariance by decomposing the data into interpretable lipid contributions, representing axes of lipid program modulation (Supplementary Fig. 7b). Additionally, diffusion map embeddings of pixels uncovered finer covariations, highlighting subtle distribution differences between related lipids including LPC 18:1 and LPC 22:6 (Supplementary Fig. 7c,d). A sorted distance matrix of voxel similarity in lipid space revealed a block-like pattern, suggesting a finite number of embryonic lipidic states (Supplementary Fig. 7e). Clustering voxels into discrete categories (Methods) resulted in lipid territories, which, after comparison with hematoxylin and eosin (H&E) staining, identified several organs, including the notochord, the swim bladder, the spinal cord and different brain regions (Fig. 5d,e). The correspondence between lipid territories and clusters was consistent across different clustering methods (Supplementary Fig. 7f–h). To explore finer subdivisions within lipid territories, we focused on the nervous system, rich in lipid diversity (Extended Data Fig. 8b), and reclustered. Interestingly, this revealed subregions corresponding to anatomical tissues including the eye and the optic cup that differed primarily in LPC, PC and PG concentrations (Fig. 5f and Supplementary Fig. 7i).

Fig. 5: Characterization of metabolically defined tissues in the embryonic zebrafish at 72 hpf.
figure 5

a, Three-dimensional reconstructions for four lipids with m/z value and lipid indicated (sagittal and dorsal views). Schematic indicating yolk, head and tail displayed above. b, Overlay of the four lipids visualized in a. c, Sagittal and dorsal views of the first nine principal components retrieved after selecting for annotated lipids. BR1, BR2, BR3, brain regions; dig, digestive system; NC, notochord; SB, swim bladder; SC, spinal cord. d, Visualization of metabolically defined tissue clusters after application of the unsupervised clustering algorithm (sagittal view). M1, M2, musculature. e, Top row: medial section (sagittal view) of H&E staining overlaid with contours of metabolic regions shown in d. Lipids that spatially localize to specific regions are indicated along with an inset of the H&E image. Both images are overlaid with contours representing delineated clusters. f, Left: subclustering of neural regions (top) with corresponding H&E sections (bottom). Right: dendrogram displaying relative distances between clusters. N.C., neural crest; MUS, musculature. g, Lipid clustering of a sagittal section at 72 hpf (left) and marker gene distributions from a consecutive section from HybISS (right). The importance of a gene in predicting lipid territories is indicated beside the gene name as a score. Scores are computed from the impurity decrease within each tree of a decision tree classifier and range from 0 to a maximum of 0.1.

We further tested lipid territory correspondence with anatomical structures by performing hybridization-based in situ sequencing (HybISS)52,53 on a sagittal section of the zebrafish at 72 hpf and MALDI-MSI on a consecutive section. A panel of tissue markers was used to highlight anatomy (Methods), while lipid-based pixel clustering was performed on MALDI-MSI data to retrieve lipid territories54,55,56. Several tissue markers faithfully localized to lipid territories, indicating that lipid distribution follows anatomy and that anatomy, in turn, can be inferred from lipid patterns (Fig. 5g).

Prompted by the tight correspondence with tissue–marker gene expression, we investigated whether lipid territories could be derived from metabolic gene expression. Analyzing publicly available single-cell expression profiles, we found that metabolism-related genes matched poorly with annotated cell identity due to their overall low expression. Specifically, lipid metabolic genes (for example, enzymes and transporters) were significantly less capable of predicting cell identity (R2 = 0.65; bootstrap estimate) than the most variable genes (R2 = 0.78) and performed similarly to random gene sets (R2 = 0.60)57 (Extended Data Fig. 10a,b and Methods). Therefore, gene expression data may not capture key predictors of metabolic activity.

To assess whether metabolic genes, although weakly expressed, could still localize to anatomical features, we expanded our HybISS approach to include highly variable metabolic genes (Methods). Except for a few cases, metabolic genes were typically noisier than marker genes and less capable of identifying lipid territories than MALDI-MSI data (Extended Data Fig. 10c–e). Only three of the ten most region-predicting genes were metabolic: apobb.1, apoeb and fabp1b.1 (Fig. 5g, Extended Data Fig. 10f and Methods).

Next, we biochemically characterized anatomical regions by performing an enrichment analysis of lipids across the delineated lipid territories and identified the one where each species was maximally abundant (Fig. 6a). Stratifying these assignments by tissue revealed that several regions, including the swim bladder, the digestive system and brain, were predominantly characterized by a few lipid classes. This indicates that lipid species from the same class tend to localize within identical anatomical structures (Extended Data Fig. 10g).

Fig. 6: Biochemical organization of metabolically defined tissues in the embryonic zebrafish at 72 hpf.
figure 6

a, Enrichment heatmap of lipids within clusters shown in Fig. 5d. b, Mol% composition of metabolite-defined clusters stratified by lipid class for 3D fish (n = 1). Pie plots from left to right show three lipids of increasing concentrations (lowest, median and maximum) and display localization according to mol%. c, Force-directed layout network representation of lipid colocalization. Lipid class is indicated with a color code. Images beside node clusters display representative lipid distribution of node clusters. d, Subset of TG species images (left) with FA compositions (right). Blue and red boxes indicate FAs with low and high carbon content and unsaturations, respectively. e, H&E (top left), brightfield (bottom left), genes from HybISS (top right) and lipids from MALDI-MSI images overlaid (bottom right) from a coronal section through the head of a zebrafish at 72 hpf (n = 1). Scale bars, 100 μm. f, H&E (left), brightfield (middle) and lipids from MALDI-MSI images overlaid (right) from a coronal section through the swim bladder of a zebrafish at 72 hpf (n = 1). Scale bars, 100 μm. g, Representative brightfield (BF) and Oil Red O (ORO) staining of control and sptlc1-mutant fish with a magnified view of the swim bladder in the right column. Scale bars, 1 mm. MO, morpholino.

By combining spatial and bulk lipidomics data, we obtained relative concentrations between lipid classes. As expected, PCs constitute a substantial proportion of lipids across clusters, while other classes showed more variability. For example, a divergent trend between PE and PS concentrations was evident (Fig. 6b) and may be ascribed to expression levels of pisd, which encodes a decarboxylase that converts PS to PE58. Accordingly, in the nervous systems, where pisd localizes57, PE concentrations were higher and PS concentrations were lower than in other regions (Fig. 6b and Extended Data Fig. 10h).

Similarly, a biochemical organization by lipid class emerged when lipid–lipid relationships were visualized in a connectivity graph based on spatial covariation (Fig. 6c). Diacylglycerols and lysolipids populated distinct nervous system regions, TGs were distributed between two clusters, SMs colocalized to the swim bladder, spinal cord and pericardial region, and PCs localized to two major lipid territories in the head and posterior tissues.

Interestingly, TG localization was generally dictated by FA carbon chain lengths and unsaturation levels. Shorter, less unsaturated TGs (≤54 C; less than four double bonds) accumulated in the yolk as energy reserves20. Surprisingly, long-chain, highly unsaturated TGs populated anterior anatomical regions distinct from the nervous system (Fig. 6c,d). In zebrafish embryos, neutral lipids are restricted to the yolk until 24 hpf, appearing in the head only at later stages20. Our bulk lipidomics data and spatial atlas consistently suggest that extra-yolk TGs are synthesized during development (Supplementary Fig. 6d,e) and contain longer (>20 C), more unsaturated FAs (Fig. 6d). Our atlas shows that, in the head, TGs localized to a mesenchymal compartment related to cartilage and bone primordia (Fig. 6e), aligning with evidence of extensive fat depots within bones59.

SM localization to the swim bladder was also unexpected. This organ, homolog of the tetrapod lung60,61,62, begins forming at 48 hpf and inflates around 96 hpf to regulate buoyancy. As in lungs, inflation requires lipid-laden surfactant secretion in the organ cavity to lower surface tension61,62. SMs began accumulating in the prospective swim bladder at 48 hpf, forming a distinguishable cluster by 72 hpf (ref. 63; Fig. 5d). Inspecting SM species distribution in coronal sections from embryos at 72 hpf, we found that they populate concentric layers of the swim bladder, possibly reflecting the organ’s known tissue structure63 (Fig. 6f).

Intrigued by their peculiar distribution, we assessed sphingolipid function in development by perturbing its biosynthesis. To this aim, we injected morpholinos into one-cell-stage embryos (Methods) targeting sptlc1, which encodes an enzyme that catalyzes the rate-limiting step in sphingolipid biosynthesis64. At 120 hpf, sptlc1 morphants displayed developmental defects including tail curling, pericardial edema (Extended Data Fig. 10i) and significantly impaired swim bladder inflation (Extended Data Fig. 10j). Notably, sptlc1-knockout lines were previously reported to exhibit faulty swim bladder inflation21. Finally, Oil Red O staining65 highlighted lower surfactant levels in sptlc1 morphants, without histological alteration (Fig. 6g and Extended Data Fig. 10k). These data indicate that sphingolipids accumulate in the forming swim bladder and are involved in surfactant production and inflation during zebrafish embryogenesis.

Altogether, this evidence reveals that, during embryogenesis, specific metabolic programs are established within different structures, shaping their formation and function.

Discussion

During embryogenesis, signals and spatially defined differentiation trajectories inform identical precursor cells to generate anatomical structures with distinct biochemical compositions. While morphogenic programs have been described well through the lens of transcriptomics, understanding how metabolism evolves spatiotemporally in the context of embryogenesis is essential to comprehend molecular processes that guide tissue patterning.

Here, we generated a 4D lipid metabolic atlas of vertebrate development to characterize the emergence of biochemical territories. To achieve this goal, we acquired a comprehensive dataset of high-resolution MS images from zebrafish embryos at different developmental stages. We also developed a general and accessible computational framework, uMAIA, which rethinks metabolomic MSI data processing toward the integration of multiple acquisitions. This advancement unlocks new possibilities for the construction of whole-organism metabolomic atlases while facilitating unbiased analyses and ensuring reliable cross-acquisition comparisons.

Analysis of metabolomic MSI data presents technical challenges that become relevant when analyzing multiple datasets. While analyses of large datasets have been pursued, for example, in relation to the study of the proteome and organ variability28,29, batch effects and intensity variations across acquisitions have limited the scope of spatial metabolomic and lipidomic investigations. Previous studies adapted to these constraints by applying supervised machine learning approaches28,30 or by restricting analyses to major tissue compartments29, as finer-scale analysis might have been confounded by technical variations. uMAIA addresses technical challenges by providing an integrated pipeline that enables both reliable cross-acquisition comparisons and unbiased detection of biochemical territories, expanding the analytical capabilities of large-scale metabolomic and lipidomic MSI studies. Furthermore, our initial tests also suggested applicability of uMAIA to other MSI datasets, including proteomics. Different acquisitions are made interoperable with conservative procedures and minimal assumptions: uMAIA creates normalized and coherent feature sets, has no black box components and does not impute values or create dependencies between features. Unlike deep learning integration methods, uMAIA is a principled and conservative normalization method that preserves biological variation, while it may retain extreme batch effects. This advancement in data handling facilitates the study of spatial metabolomics with other modalities to reveal new interactions and regulatory axes within tissues and cells, as was recently shown for lipotypes15.

In this work, we present a complete multivariate analysis of the MSI atlas we generated. Our results reveal a remarkable correspondence between lipid distribution and anatomy, reflecting how metabolic programs are organized based on cell type compositions of different tissues. Importantly, transcriptomic methods showed suboptimal sensitivity to capture the spatial distributions of key metabolic genes. This indicates a necessity to quantify lipids directly to address the lipidome’s spatial organization.

When examining spatiotemporal lipid level changes over development, we encountered unexpected findings. This included PC species spatially segregating based on unsaturation degrees. As key membrane lipids with distinct biophysical properties66,67,68, their varying tissue distribution likely affects membrane behavior. Determining how this pattern emerges mechanistically and understanding its physiological relevance are promising goals for future studies.

More generally, the colocalization of lipid classes to lipid territories suggested that territories are defined by metabolic pathway activity and that they are possibly involved in specific tissue functions. Examples include sphingolipid accumulation in the swim bladder, where they might function as surfactants, as well as long-chain TG species in bone primordia, where they may play roles in osteogenesis in the zebrafish head. We expect discoveries of this kind to help link diseases with potential therapeutic targets, as exemplified by the disruption in swim bladder function observed upon perturbation of sphingolipid biosynthesis.

Global assessment of lipid distributions during embryo development revealed a close parallel between the unfolding of lipid territories and organogenesis. Embryonic lipid territories did not specify simultaneously over time; the anterior regions, particularly the nervous system, exhibited the highest variation in metabolic composition and were the earliest to undergo fine metabolic patterning. Investigating whether this rostral lipidomic specification is conserved or more pronounced in species with finely regionalized central nervous systems, such as mammals, holds promise for future research. More generally, multispecies atlases will enhance our understanding of whether lipid patterning coevolved with brain expansion and increased complexity.

A limitation of our study is the inability to discern lipid transport from in situ de novo synthesis during embryogenesis. Additional experiments at higher temporal resolution and including metabolic labeling of newly synthesized lipids in combination with spatial transcriptomics would be necessary to address metabolic fluxes in space. Furthermore, due to ion fragmentation and sample quality differences, MSI is limited to capturing a fraction of molecular species that may alter abundance estimates.

In conclusion, by developing uMAIA, we have made the unbiased analyses of large, organismal-level MSI datasets tractable. Exploiting these capabilities, we conducted a whole-embryo exploration of an important compositional aspect of living systems: their lipidome. Our results reveal substantial heterogeneity along this underexplored axis, highlighting it as a compelling ground for discoveries in the regulation of development.

Methods

Data generation

Animal work and cryosectioning

Zebrafish embryos of the Tübingen long fin strain were kept at 28 °C until collection at 8, 24, 48 and 72 hpf. Following manual dechorionation, embryos were euthanized by fixation in 4% PFA at 4 °C overnight. After rinsing with 1× PBS, embryos were included in a block of 10% pork skin gelatin (type A, Sigma), which was allowed to solidify at 4 °C for 1 h, and then immediately frozen in isopentane prechilled to −65 °C. Specimens were kept at −80 °C until cryosectioning over the sagittal axis at −28 °C for a sectional thickness of 12 μm. Sections were stored at −80 °C until MALDI-MSI acquisition. All zebrafish husbandry procedures have been approved and accredited by the Federal Food Safety and Veterinary Office of the Canton of Vaud (VD-H23).

Matrix-assisted laser desorption–ionization mass spectrometry imaging datasets

Publicly available MALDI-MSI datasets were retrieved from METASPACE. Human kidney samples were generated by the NIH Kidney Precision Medicine Project (ID 2024-06-06_19h26m13s, 2024-06-06_19h22m09s, 2024-06-06_19h16m21s, 2024-06-06_19h08m24s, 2024-06-06_19h03m38s, 2024-06-06_18h57m58s, 2024-06-06_18h45m39s, 2024-06-06_18h40m58s), and baboon lung samples were generated by the Pacific Northwest National Laboratory (ID 2024-06-11_18h30m14s, 2024-06-11_18h28m14s, 2024-06-11_18h35m41s, 2024-06-11_18h26m59s, 2024-06-11_18h43m25s). The datasets were chosen such that the acquisitions represented similar samples, allowing us to assume that molecules should be present with similar intensities. From each section, images were retrieved on the METASPACE interface, and intensity ranges were set equal for all sections. Compounds that were not detected were marked as ’missing’, and those with aberrations were highlighted.

We collected MALDI-MSI datasets for zebrafish embryos. Sections from alternating sagittal sections were dried at room temperature before coating with CHCA matrix prepared at 15 mg μl−1 in a solution of acetone and water (1:1) and 0.1% trifluoroacetic acid for positive ionization. Matrix was deposited with a sprayer (Sprangler) at 350 rpm and a flow rate of 5 μl min−1 for 40 min. MALDI-MSI acquisitions were acquired using an AP-SMALDI5 AF system coupled to a Q Exactive orbital-trapping mass spectrometer in full MS scan and positive ion mode in the 400–1,200-m/z range. The MALDI laser focus was optimized manually using source cameras aiming at a diameter of the focused beam of <5 μm. For each pixel, the spectrum was accumulated from 50 laser shots at 100 Hz. MS parameters in Tune software (Thermo Fisher Scientific) were set to a spray voltage of 4 kV, S-Lens at 100 eV and capillary temperature to 250 °C. Calibration was performed regularly with ISs (that is, known matrix ions), and mass error was kept within ±2 ppm. A pixelated scan mode was used at a speed of 1.6 pixels per second. The spatial resolution for acquisitions of embryos at 24, 48 and 72 hpf was 7 μm. For the embryo at 8 hpf, a resolution of 5 μm was used to account for its smaller size. In addition to these sections used for atlas construction, we acquired sections from a biological replicate for each time point to check the consistency of results. A total of 96 MS images were acquired.

Hybridization-based in situ sequencing

The protocol was followed according to ref. 52. Zebrafish larvae were euthanized by fixation in 4% PFA at 4 °C overnight, embedded in 2% carboxymethyl cellulose and cryosectioned at 10 μm. Sections were stored at −80 °C and processed. For transcript detection, sections were incubated with padlock probes (concentration of 10–50 nM per probe) for marker and metabolic genes in the same mix. Samples were imaged on a Leica DMi8 epifluorescence microscope equipped with an LED light source (Lumencor, SPECTRA X, nIR, 90-10172), an sCMOS camera (Leica, DFC9000 GTC, 11547007) and a ×20 objective (HC PC APO, NA 0.8, air), with 10% overlap between tiles and 8–12 z-stack planes with 1-μm spacing.

For image processing, z projection was carried out using a variant of wavelet-based extended depth of field70. Tiles were stitched with ASHLAR71 using the DAPI channel of the acquisition. Intercycle registrations were obtained using the wsireg Python package72, which wraps elastix73. Spot detection was performed with Spotiflow74 using the pretrained HybISS model with default settings. Detections were decoded using a variant of starfish’s53 nearest-neighbor decoder.

Bulk lipidomic experiments

Zebrafish embryos were collected at the 8-, 24-, 48- and 72-hpf stages and anesthetized with tricaine. After manual dechorionation, 15 individuals per replicate were pooled together and immediately frozen in liquid nitrogen. For each developmental stage, four replicates were analyzed by the high-throughput targeted HILIC–MS/MS lipidomic workflow at the Université de Lausanne. Complex lipids were quantified using a high-coverage, stable isotope dilution LC–MS/MS approach. Zebrafish tissue (15 embryos) was extracted with isopropanol48 prespiked with the IS mixture containing 75 isotopically labeled lipid species (with multiple representatives per lipid class, with varying FA composition). The resulting extract was analyzed by the HILIC–ESI-MS/MS approach in positive ([M + H]+ and [M + NH4]+ adducts) and in negative ([M + AcO], [M − 2H] and [M − H] adducts) ionization mode, using a TSQ Altis LC–MS/MS system (Thermo Scientific), as previously described by Medina et al.48. In two separate runs (using a dual-column setup), 1,166 lipid species belonging to five major classes of complex lipids (glycerolipids, glycerophospholipids, cholesterol esters, sphingolipids and free FAs) were quantified with high precision and specificity. Using HILIC separation, endogenous lipids coelute with their corresponding IS, thus allowing for the appropriate correction of matrix effects (in the same solvent composition)49,75. Optimized lipid class-dependent parameters were used for data acquisition in timed selected reaction monitoring mode. Raw LC–MS/MS data were processed using TraceFinder Software (version 4.1, Thermo Fisher Scientific). Lipid abundance was reported as an estimated concentration using a mixture of ISs spiked at known concentrations and designed to correct for the differences in ionization and fragmentation efficiency depending on FA chain length and degree of unsaturation76.

Morpholino injection and Oil Red O staining

A translation-blocking morpholino (sptlc1-ATG MO, 5′-ACCCACTGTTGCCCCGACGCCATTT21) (Gene Tools) was used to knock down sptlc1 expression. Morpholinos were diluted to 0.1 mM. A bolus of 100 µm was injected in the cell of one-cell-stage embryos of the AB zebrafish strain. As a control, a mixture of random sequences was injected. Injected embryos were raised in an incubator at 28 °C with the addition of 1-phenyl-2-thiourea to inhibit melanin production to better visualize the swim bladder. The phenotype was assessed at 120 hpf. Non-injected siblings of the same clutch were also grown in the same conditions to control for any additional defects not related to the injections. At 120 hpf, about eight representative larvae of each condition were placed in depression wells and photographed. Larvae were immediately fixed in 4% PFA at 4 °C for 5 h. Fixed zebrafish were then rinsed with PBS, incubated in methanol at −20 °C for 24 h, briefly equilibrated in 60% isopropanol and stained with a solution of 0.3% Oil Red O (Sigma-Aldrich, O0625) in 60% isopropanol for 3 h as previously described20. Samples were then rinsed with 60% isopropanol. After a final wash with PBS, organisms were placed in depression wells and photographed.

Models and computational approaches

The uMAIA framework comprises three main algorithms designed to achieve the following tasks:

  • Peak calling for image generation from raw mass spectra

  • Peak matching for the creation of a unified feature space

  • Intensity normalization for batch effect removal.

Peak calling: processing of single mass spectrometry imaging acquisitions

The first algorithm processes raw mass spectra within single MSI acquisitions to call peaks and construct images.

Peak calling of mass spectrometry imaging spectra

Raw MSI spectra are stored as collections of tuples: detected m/z values with corresponding signal intensity and coordinates (mz, intensity, x, y). We round m/z values to the nominal mass resolution of the instrument to obtain N values (bins of size 10 × 10−5 were used). The data are loaded as a sparse matrix Sp,m containing the signal intensity values recorded at each pixel p (0, P) and position in the mass spectrum indices with m (0, N). Next, a histogram representation of the data, fm, is obtained by counting nonzero entries across pixels, that is, fm = ∑p(Sp,m > 0). A Gaussian kernel of small bandwidth (10 × 10−4, approximately reflecting the mass accuracy of the instrument) is applied to fm to reduce the impact of noise.

Next, the following two-step (algorithms 1 and 2) peak-calling procedure, inspired by the watershed algorithm, is applied to the histogram representation of the data to retrieve the set of peaks \({\mathcal{M}}\) and their boundaries \({\mathcal{B}}\). Algorithm 1 identifies seeding points based on histogram representation peak maxima, whereas algorithm 2 retrieves the bin sizes for imaging.

Image retrieval and further processing

To extract images from raw data, we aggregate Sp,m according to the set of boundaries \({\mathcal{B}}\) resulting from algorithms 1 and 2. For each pixel p and each compound c, we have \({X}_{p,c}={\sum }_{m\in {{\mathcal{B}}}_{c}}{S}_{pm}\). Images are normalized for total ion content by dividing each pixel by the total signal detected: Xpc ← Xpc/∑cXpc.

Outputs of the peak-calling method are (1) images saved in a .h5ad format and (2) a .csv file specifying the intervals that were selected among other metadata.

Algorithm 1 Retrieval of \({\mathcal{M}}\)

 1. Procedure InitializeBins (fm)                       Retrieval of unique m/z representing individual molecules

 2.  \(m\leftarrow \arg \mathop{\max }\limits_{m}\,{f}_{m}\)                                         Initialize the first m/z detection

 3.  \({\mathcal{M}}\leftarrow {\mathcal{M}}\cup \{m\}\)                                                 Add m to set

 4.  \(t\leftarrow \mathop{\max }\limits_{m}\,{f}_{m}\)                                       Initialize threshold to begin iterations

 5.  While t ≠ 0 do

 6.   t ← t − 1                                         Continue reducing the threshold

 7.   \({\mathcal{D}}\leftarrow \{d| \forall \,i\in {\mathcal{M}}:{f}_{i} > t\}\)                              Retrieve contiguous intervals above threshold

 8.   For all \(d\in {\mathcal{D}}\) do                                     Iterate over contiguous sequences

 9.    If \(d\cap {\mathcal{M}}=\varnothing\), then

 10.    m ← ccenter                                          Calculate the center of d

 11.    \({\mathcal{M}}\leftarrow {\mathcal{M}}\cup \{m\}\)                                         New molecule identified

 12. Return \({\mathcal{M}}\)

Algorithm 2 Determine the intervals \({\mathcal{B}}\)

 1. Procedure ExpandBins (\({\mathcal{M}},\,{\bf{f}}\))                                   Expand bins beginning with \({\mathcal{M}}\)

 2.  For all \(m\in {\mathcal{M}}\) do                                    Iterate over m/z of unique molecules

 3.   l ← m − 1

 4.   r ← m + 1

 5.   While f[l] ≥ f[l − 1] do                                  Expand the left boundary of the bin

 6.    l ← l − 1

 7.   While f[r] ≥ f[r + 1] do                                 Expand the right boundary of the bin

 8.    r ← r + 1

 9.   \({\mathcal{B}}\leftarrow {\mathcal{B}}\cup \{[l,r]\}\)                                              Store bin range

 10.  Return \({\mathcal{B}}\)

Network flow-based peak matching: optimized matching of peaks into a unified feature set

MSI datasets are not directly produced in the same feature space, differing in the number of identified peaks and their precise m/z positions. uMAIA automatically creates a unified feature space (featurization): a peak from each acquisition is connected with one from another acquisition if they are considered the same molecule, without requiring external references.

Connected peaks should satisfy a set of properties and constraints: closer peaks should be matched, it is better to match a peak than to leave it unconnected, and each peak can be connected only to another from a different acquisition (as they are the same molecule). These properties are not trivially achievable with a heuristic or procedural approach (for example, nearest-neighbor graph). We aim to identify the choice of connections that respects all properties and constraints.

To formalize the problem, we consider the following quantities:

  • Each of the n acquisitions provides a set of m/z peaks \({{\mathcal{A}}}_{1},{{\mathcal{A}}}_{2},\ldots ,{{\mathcal{A}}}_{n}\). For simplicity, we refer to \({\mathcal{A}}\) as the union of all sets, with \(\left\vert \right.{\mathcal{A}}\left\vert \right.=L\).

  • Considering each element in \({\mathcal{A}}\) as a node in a graph \({\mathcal{G}}\), we define Gij, the adjacency matrix indicating the possible connections that are allowed. This is defined using a distance threshold t so that Gi,j = 1 if dist(i, j) < t, forcing Gij = 0 if i and j are from the same acquisition.

  • The desired matching uMAIA achieves is represented by a graph \({\mathcal{X}}\) with an adjacency matrix Xij. Xij = 1 when peaks from two acquisitions are considered to have originated from the same ion or compound. Each connected component of this graph is then considered a feature and used to construct the final feature space.

  • Cij indicates the associated cost of a particular matching Xij = 1. The matrix is set by the distance in m/z between two peaks.

We aim to find the optimal \({\hat{X}}_{ij}={\rm{argmin}}\mathop{\sum }\nolimits_{i = 1}^{m}\mathop{\sum }\nolimits_{j = 1}^{m}{X}_{ij}{C}_{ij}\) while obeying a set of constraints. In its general form, the problem can be categorized as an integer linear problem (ILP), which is only tractable for small L values (while we have a large graph). However, two important simplifications we make here render it tractable in polynomial time. First, we divide the mass spectra into m/z intervals spanning one unit, allowing us to solve mz_range smaller subproblems. Second, we do not consider all the possible links Gij between any acquisitions but allow only peaks between some pairs of acquisitions. Specifically, we considered the acquisitions ordered and allowed edges between the peaks of an acquisition and the k consecutive ones (for all analyses, k = 2). Also, we introduce a start and end nodes in the graphs that can be connected with high-cost log (L) but are required by the constraint to complete a path. Because the solution of each subproblem can be found extremely fast using LP solvers, we solve it M times for M orders of the acquisitions obtained by random permutation and choose the best solution for each interval. The problem we solve can be formulated as follows:

$$\begin{array}{lll}\mathop{\min}\limits_{{\boldsymbol{X}}}&&\mathop{\sum}\limits_{i=1}^{L}\mathop{\sum}\limits_{j=1}^{L}{X}_{ij}{C}_{ij}\\ {\rm{subject}}\;{\rm{to}}&&{X}_{ij}\in \{0,1\}\\ &&{X}_{ij}\le {G}_{ij}\\ &&\mathop{\sum}\limits _{j}{X}_{ij}=\mathop{\sum}\limits_{i}{X}_{ij}=1\quad \forall i,j\notin \{\rm{start,end}\}.\end{array}$$
(1)

After solving the problem, features are retrieved by traversing all paths beginning from the start node until the end node is encountered.

Normalization: intensity correction for mass spectrometry imaging datasets

MSI experiments are prone to fluctuations in the detected signal intensity due to experimental factors, potentially distorting the intensity histograms of different acquisitions.

Pixel intensity histograms from MSI data are typically bimodal, consisting of background and foreground signal components. Not all images contain both components. Our framework models batch effects in the foreground signals, assuming background signals are relatively stable.

Our approach to correct batch effects follows these steps:

  • Estimation of the batch effect using a hierarchical Bayesian model

  • Use of the estimation to determine signal intensity transformations for each image

  • Application of functions for data normalization.

Theory, model and implementation of normalization

We consider the complete set of pixels p measured across all acquisitions a so that \(p\in {\bigcup }_{a}{{\mathcal{S}}}_{a}\), and denote ycp the ground truth value associated with each compound c measured. Our goal is to correct batch effects and recover ground truth values despite distortions Tca induced by the measurement process for each acquisition a and compound c where we measure

$${x}_{cp}={T}_{ca}(\;{y}_{cp})$$
(2)

with \(p\in {{\mathcal{S}}}_{a}\). To compare images from different acquisitions, we aim to estimate \({T}_{ca}^{-1}\) so that we can perform the transformation to correct the data.

$${\hat{y}}_{cp}={\hat{T}}_{ca}^{-1}({x}_{cp})$$
(3)

We propose to estimate \({T}_{ca}^{-1}\) from the intensity value histograms observed across acquisitions and compounds. We start by defining the ‘reference’ intensity distribution across pixels with a probability density function fc(ycp) or, equivalently, fc(y). The transformation xcp = Tca(ycp) induces a new probability distribution gca(x). The relation between g, f and T is simplified if we consider the cumulative density functions Gca and Fc:

$${G}_{ca}={F}_{c}\left({T}_{ca}^{-1}(x)\right),$$
(4)

which provides a way to estimate the desired normalizing transform \({\hat{T}}_{ca}^{-1}={\hat{F}}_{c}^{-1}\left({\hat{G}}_{ca}(x)\right)\). Because \({\hat{F}}_{c}\) does not depend on the acquisition, it can be chosen with some freedom, as all the normalization is performed by applying \({\hat{G}}_{ca}\) and F. We focus thus on estimating \({\hat{G}}_{ca}\) or, equivalently, \({\hat{g}}_{ca}\).

Generally after parameterizing \(g(x;{\hat{\theta }}_{ca})\), one would estimate the parameters \({\hat{\theta }}_{ca}\) by maximum likelihood:

$${\hat{\theta }}_{ca}=\arg \mathop{\max }\limits_{{\theta }_{ca}}\prod _{p\in {{\mathcal{S}}}_{a}}g({x}_{pc}| {\theta }_{ca}).$$
(5)

What complicates this endeavor is that the measured xpc coming from a single acquisition cannot be expected to be representative of the complete set of pixels \({\bigcup }_{a}{{\mathcal{S}}}_{a}\): foreground or background components may or may not be present within a given acquisition and biological variability would likely not be retained. Histogram matching therefore is not suitable for the problem: we need to combine information across acquisitions and compounds to obtain a robust \({\hat{T}}_{ca}^{-1}\).

To achieve this, we first consider the parameterization of g as a mixture of two Gaussians:

$$\begin{array}{rcl}g({x}_{pc}| {\theta }_{ca})=&&g\left({x}_{pc}| {\mu }_{ca}^{0},{\mu }_{ca}^{1},{\sigma }_{ca}^{0},{\sigma }_{ca}^{1},{\rho }_{ca}\right)\\ =&&{\rho }_{ca} \frac{1}{\sqrt{2\pi {\left({\sigma }_{ca}^{0}\right)}^{2}}}\exp \left(-\frac{{\left({x}_{pc}-{\mu }_{ca}^{0}\right)}^{2}}{2{\left({\sigma }_{ca}^{0}\right)}^{2}}\right)\\ &&+(1-{\rho }_{ca}) \frac{1}{\sqrt{2\pi {\left({\sigma }_{ca}^{1}\right)}^{2}}}\exp \left(-\frac{{\left({x}_{pc}-{\mu }_{ca}^{1}\right)}^{2}}{2{\left({\sigma }_{ca}^{1}\right)}^{2}}\right).\end{array}$$
(6)

We assume that parameters of the mixture can be factorized as to consider batch effect derived from acquisition- and compound-specific factors, given by γa and λc, respectively. Specifically, we choose \({\mu }_{ac}^{1}={\mu }_{ac}^{0}+{\gamma }_{a}{\lambda }_{c}+{\delta }_{c}\), where γa (−2, 2) λc (−2, 2) and a prior for δc normal(3, 1) represents the expected difference between modes. The prior for the background mode is \({\mu }_{ac}^{0} \sim {\rm{normal}}({m}_{0},1)\), with m0 chosen empirically by considering the distribution of σ1 of a set of naive two-component GMM fits. Similarly, \({\sigma }_{ac}^{1}={\Sigma }_{a}+{\Sigma }_{c}\). The following priors Σa, Σc ~ exponential(s), where s was chosen empirically by considering the distribution of σ1 of the full set of naive two-component GMM fits.

This corresponds to the following model:

$$\begin{array}{ll}{\delta }_{c}\sim {\rm{normal}}(3,1)\\{\lambda}_{c}\sim {\rm{uniform}}(-2,2)\\ {\gamma }_{a}\sim {\rm{uniform}}(-2,2)\\{\Sigma }_{a}\sim{\rm{exponential}}(s)\\{\Sigma }_{c}\sim {\rm{exponential}}(s)\\{\sigma }_{c}^{0}\sim {\rm{uniform}}(0.05,0.5)\\{\mu }_{ac}^{0}\sim {\rm{normal}}({m}_{0},1)\\{\mu }_{ac}^{1}={\mu }_{ac}^{0}+{\gamma }_{a}{\lambda }_{c}+{\delta }_{c}\\ {\sigma }_{ac}^{1}={\Sigma }_{a}+{\Sigma}_{c}\\{\pi}_{ca}\sim{\rm{Dirichlet}}(0.5,0.5)\\ {z}_{cp}\sim {\rm{Bernoulli}}({\pi }_{ca})\\ {x}_{pc}\sim \delta ({z}_{cp}) {\rm{normal}}({\mu }_{ac}^{1},{\sigma }_{ac}^{1})+\delta ({z}_{cp}-1) {\rm{normal}}({\mu }_{ac}^{0},{\sigma}_{c}^{0})\quad for\;p\in {{\mathcal{S}}}_{a}.\end{array}$$

The model was implemented and fit using the NumPyro probabilistic programming language. The MAP estimates of the parameters (\({x}_{pc},{\mu }_{ca}^{0},{\mu }_{ca}^{1},{\sigma }_{ca}^{0},{\sigma }_{ca}^{1}\)) are determined and used to compute the transformation. f was chosen as the bimodal Gaussian mixture model with the following parameters:

$$\begin{array}{r}{\mu }_{c}^{0}={\rm{mea{n}}}_{a}\left({\mu }_{ca}^{0}\right);\quad {\sigma }_{c}^{0}={\rm{mea{n}}}_{a}({\sigma }_{ca}^{0})\\ {\mu }_{c}^{1}={\rm{mea{n}}}_{a}\left({\mu }_{ca}^{1}\right);\quad {\sigma }_{c}^{1}={\rm{mea{n}}}_{a}\left({\sigma }_{ca}^{1}\right).\\ \end{array}$$
(7)

The CDF, Gca and the inverse CDF \({F}_{ca}^{-1}\) are evaluated numerically and used to compute the transformation \({\hat{T}}^{-1}_{ca}\).

Simulations and method evaluation

Analysis of single acquisitions and peak calling

Comparison with other methods

Freely available software was used for benchmarking:

  1. 1.

    MALDIquant: R package for MSI processing and calibration (adaptive binning) (Gibb & Strimmer38)

  2. 2.

    METASPACE: online platform for visualization and annotation (static binning and references m/z values)77

  3. 3.

    Mirion: software visualizing molecules in MSI data (static binning) (Paschke et al.37)

  4. 4.

    MSiReader: software visualizing molecules in MSI data (static binning and reference m/z values).

Imaging molecules requires identifying a range of m/z values that encompasses mass shifts for peaks. Ranges identified by each method were retrieved and compared with those from the uMAIA method (algorithms 1 and 2).

Evaluation of mass shifts within images

The degree to which mass shifts are correlated between molecules that are almost isobaric was evaluated by manually establishing ranges for each molecule in one MSI acquisition, and a reference m/z (the most frequent m/z detection) was identified. Next, for each molecule (that is, m/z range), a mass shift for each spectrum was calculated by subtracting the observed m/z from the reference m/z. Mass shifts for different pairs of molecules were then scattered together for the subset of pixels that contained both molecules. Pearson’s R was calculated for the same subset.

Simulations for tested approaches

We performed simulations that mimicked mass shifts from real data. First, a set of numbers representing theoretical m/z values was generated in a unit range. To create datasets of variable molecular crowding, we varied the size of the set of numbers. To model mass shifts, random values sampled from normal distributions (σ = 0.01) and γ distributions (k = 0.1 and θ = 0.04) (as approximated from real data) were added to the theoretical m/z values for each of the 1,000 simulated spectra in a way such that the order of the molecules after addition of noise still followed the original ordering of the molecules, preventing unreal inversions of m/z positions between spectra.

Next, noise ’spike ins’ were added at random across the range with variable intensities to reflect different SNRs. An SNR of 10 in a spectrum with 100 detections would mean that there are ten times as many real signals as noise, resulting in 90 true signals and ten noise signals.

For each dataset that varied in its molecular crowding and SNR, ten instances were simulated. Bins were retrieved by static binning methods (sizes of 0.02 and 0.04), MALDIquant and uMAIA. Mutual information scores were calculated for each peak with its prediction, and a weighted average was reported.

Image quality assessment of mass spectrometry imaging data

For each interval returned by the methods (uMAIA, Mirion and MALDIquant), the fraction of generated images likely to contain multiple different molecules was estimated by counting images for which the signal was contributed by more than a peak per spectrum or pixel.

We aligned bins from different methods by considering the extent of overlap between two ranges. With this information, we calculated the following:

  • The average intensity between two images referring to the same molecule from different methods

  • The spatial correlation between the two images

  • The number of uMAIA detections identified in individual MALDIquant ranges

  • The number of MALDIquant detections identified in individual uMAIA ranges.

We considered the number of MALDIquant detections that were within a given uMAIA range and calculated the average mutual information for each molecule set.

Three metrics were used to quantify various characteristics of image quality.

  1. 1.

    Spatial chaos considers intensities in an image and partitions them into levels. By assuming images arising from noise have equal distributions of intensity at each level, a score is assigned based on how different the sizes of the intensity partitions are. Codes for the spatial chaos metric were taken from https://github.com/alexandrovteam/pySM (ref. 78).

  2. 2.

    The noise model metric was devised to distinguish images with spatial structure. A basis function consisting of small Gaussian-distributed points spanning the image was used, and a generalized linear model was fit to the image using a Poisson distribution to account for noise. A likelihood ratio test against a null hypothesis was used to assess fit.

  3. 3.

    The power spectrum metric was devised to quantify high-frequency aberrations across images. As MSI data are acquired by scanning, it is not uncommon to see aberrations that consist of stripes in the image. Images were converted into its Fourier space, and the area representing low- and high-frequency signals was summed, and their ratio was calculated.

A threshold was set for each metric based on manual inspection and kept the same for all analyzed datasets. These datasets included a DESI, FTICR and four samples from an AP-SMALDI instrument. The number of images that surpassed all thresholds for their respective metrics was calculated for each dataset and method.

Peak matching

Simulation to assess matching

To mimic mass shifts in real data, the same simulation was done as described above, varying the number of sections over ten realizations. The ILP optimization was run and benchmarked to a binning approach in which the size of the bin was equal to 2 s.d. of the specified noise distribution. Metrics including precision, accuracy and recall were calculated.

Evaluation of peak matcher on real data

The peak-calling algorithm was applied to 20 embryonic zebrafish sections. Theoretical m/z values were retrieved for molecules in each section, which were subjected to either our network flow-matching method or traditional binning methods. Four different bin sizes (0.001, 0.0025, 0.005, 0.01) were used to group together compounds across the acquisitions.

Detection accuracy assessment was performed in two ways. First, 50 abundant matrix compounds across acquisitions were considered. To assign an interval or peak to one of these compounds, we used the overlap with the theoretical m/z value as the criterion. For each individual acquisition, we considered a score of ambiguity computed by counting the number of observations falling inside the interval over the single one expected. Second, to score how consistently isotopolog pairs (M, M + 1) were present after featurizing with our approach versus standard binning, we considered the two binary vectors representing the N and N + 1 isotope presence across acquisitions, respectively, and computed both Jaccard and Euclidean distances between the vector pairs. This was performed only for the 50 most abundant compounds to avoid stochastic dropouts occurring closer to the detection threshold, confounding the analysis.

Finally, we considered all molecules present across the acquisitions. Four different bin sizes (0.001, 0.0025, 0.005, 0.01), including our ILP approach, were used to group together compounds across the sections.

Batch effect characterization

Empirical estimation of the error matrix to study batch effects

To verify that the normalization model is justified by the error structure of the data (that is, that fluctuations in the data are a combination of sample-dependent and feature-dependent factors), we consider a simplified empirical estimation framework to estimate the error. This empirical estimation exploits the fact that our acquisitions are sequential sections of the same structure.

The estimation is based on assumptions:

  1. 1.

    The observed background mode is equal to the true background mode \({m}_{ca}^{0}={\mu }_{ca}^{0}\).

  2. 2.

    As sections are consecutive, the high-frequency deviations in foreground mode shifts between consecutive sections are attributable to the batch effect, which we desire to remove. In other words, biological signals should vary smoothly across sections.

Following the two equations from the model described above:

$${\mu }_{ac}^{1}={\mu }_{ac}^{0}+{\gamma }_{a} {\lambda }_{c}+{\delta }_{c},$$
(8)
$${x}_{pc} \sim \delta ({z}_{cp}) {\rm{normal}}({\mu }_{ac}^{1},{\sigma }_{ac}^{1})+\delta ({z}_{cp}-1) {\rm{normal}}({\mu }_{ac}^{0},{\sigma }_{c}^{0}),\quad \,\text{for}\,p\in {{\mathcal{S}}}_{a}.$$
(9)

Assuming a simple factorization of the noise, we write a reparameterization:

$$\begin{array}{r}{\mu }_{ca}^{1}={\mu }_{ca}^{0}+{\lambda }_{c}{\gamma }_{a}+{\delta }_{c}+{\epsilon }_{c}{\lambda }_{c}+{\epsilon }_{a}{\gamma }_{a}+{\epsilon }_{c}{\epsilon }_{a}\\ {\epsilon }_{a} \sim {\rm{normal}}(0,\Sigma )\\ {\epsilon }_{c} \sim {\rm{normal}}(0,\Sigma ).\end{array}$$
(10)

We factor out from this expression the real biological signal yca:

$$\begin{array}{r}{m}_{ca}^{1}={m}_{ca}^{0}+{\lambda }_{c}{\gamma }_{a}+{\delta }_{c}+{\epsilon }_{c}{\lambda }_{c}+{\epsilon }_{a}{\gamma }_{a}+{\epsilon }_{c}{\epsilon }_{a}+{y}_{ca}.\end{array}$$
(11)

Considering this equation and the assumption above, empirically estimate the variables \({\hat{\delta }}_{v}\) and \({\hat{y}}_{i,v}:\)

$$\begin{array}{ll}{\hat{\delta}}_{c}&={\text{mean}}_{a}({m}_{ca}^{1}-{m}_{ca}^{0})\\{\hat{y}}_{ac}&={\text{mean}}_{a\in[i-2,i+2]}({m}_{ca}^{1})-{\text{mean}}_{a\in[0,i-2]\cup[i+2,A]}({m}_{ca}^{1}).\end{array}$$

We can then approximate the matrix Mca, which quantifies batch effect shifts for each molecule and section.

$${M}_{ca}={m}_{ca}^{1}-{m}_{ca}^{0}-{\hat{\delta }}_{c}-{\hat{y}}_{ca}$$
(12)
$${M}_{ca}=({\gamma }_{a}+{\epsilon }_{a})({\lambda }_{c}+{\epsilon }_{c})$$
(13)

Singular-value decomposition was applied to the matrix to factorize it into two components representing acquisition- and molecule-dependent effects, and the first eigenvalue from singular-value decomposition is reported. A bootstrapped estimate of the first eigenvalue of this matrix was calculated to retrieve its variance by sampling 80% of molecules for 50 repetitions.

Simulation datasets for normalization method and evaluation

Two simulated datasets were created to assess the performance of various normalization algorithms. The first dataset evaluated methods on a phantom object, while the second used distributions taken from ISH experiments (ABA).

The first dataset is an oversimplified structure constructed from three ellipsoids in a 3D space, with two of the three ellipsoids partially overlapped to facilitate result interpretation. Molecules were simulated to be present in one or more than one of the ellipsoids. When a molecule was assigned to an ellipsoid, pixel values in the region were drawn from normal distributions with a mean sampled from a normal distribution. In cases in which a molecule was present in ellipsoids that overlapped, the intensities were summed at the overlap, resulting in a third mode within intensity distributions. We simulated sectioning of the data by retrieving 2D images.

Each section was then perturbed with batch effects to mimic the effect seen from real data. Specifically, factors representing section- and molecule-specific noise were drawn from a normal distribution. Next, these parameters were used to derive and transform the intensity distributions from ground truth data to batch effect affected.

Another simulation was performed using murine brain ISH images from the ABA. Images were normalized by minimum–maximum rescaling to reflect the dynamic range present in MALDI-MSI data after logging the data ([−7, −1]). Next, batch effect noise was applied as above. Noise-perturbed data were corrected using z normalization, ComBat and uMAIA, and quality of the correction was calculated via the root mean square error between corrected and ground truth data. Because treatment by the different normalization methods places the output data in different scales, a line was fit to the q–q plot between ground truth and corrected data. The slope of the line was used to place the two distributions in similar ranges so that intensity distributions within the image could be compared between methods.

Ability to perform regional differential intensity testing

ABA ISH data were used (refer to simulations in the preceding section). For each gene, a differential expression test was made for all pairs of five selected regions by considering the set of pixels belonging to each region. For each tested pair, a matrix representing TP and TN values was constructed (TP identified when P < 0.05). The process was repeated for perturbed data, and data were corrected using different methods (z normalization, ComBat and uMAIA). FP, FN, TP and TN instances could then be ascertained when compared to ground truth. The ground truth FPR and FNR were estimated by bootstrapping sections belonging to the ground truth.

Evaluation of algorithm on real data via principal-component analysis and clustering approaches

We evaluated pixel clustering before and after normalization and compared the result with an L/S algorithm and an imputation–integration algorithm (ComBat, scArches). All molecules present in at least 15% of pixels were used as features for clustering. Clustering was performed with the k-means algorithm, and the number of clusters was set to 30.

Alternative models for batch effect correction were tested and evaluated. Specifically, these models included a batch effect that was given by λ + γ, γ and λ. The outputs of these models were evaluated and compared to raw data and uMAIA results in PCA and UMAP space (by transforming all results by the same principal-component bases) and clustering with k means as above. Results for each model were summarized by calculating Wasserstein distances between intensity distributions of corrected data for neighboring pairs of sections. The mean over all section-wise comparisons for each molecule was calculated and reported.

Evaluation on proteomic and metabolic mass spectrometry imaging datasets

Proteomic MSI datasets from ten consecutive kidney sections were retrieved from ref. 29 and processed with uMAIA using default parameters. Raw and corrected data distributions were summarized by calculating Wasserstein distances between intensity distributions of neighboring pairs of sections as above. The mean over sections for each molecule is calculated and reported. PCA coordinates across the sections without feature selection were calculated and displayed. The analysis was repeated with four sections representing similar tissues from the NIH Kidney Precision Medicine Project (Data availability).

Evaluation on aligned data

Spectra were aligned using MSIWarp, an open-source Python library (https://github.com/horvatovichlab/MSIWarp). For alignment, one reference spectrum was used to calibrate all acquisitions from the zebrafish at 72 hpf with the following parameters: bandwidth = 15, σ = 3.0 × 10−7, ε = 1.55.

We assessed the performance of the peak-calling modules from MALDIquant and uMAIA on the 72-hpf dataset after spectral alignment. For all images reconstructed between methods, we evaluated quality metrics as described in ‘Analysis of single acquisitions and peak calling’. Images are counted as TPs if they are extracted by a method and they surpass two image quality metrics from either method, as FPs if they are extracted by the method but do not pass the quality metrics or as FNs if they were not called by a method while passing TP criteria for the other method.

The assessment of peak matching after spectral alignment involved featurizing the dataset using various bin sizes (0.001 Da, 0.0025 Da, 0.005 Da, 0.01 Da) and uMAIA. Next, for each method, we quantified the percent of features (that is, set of corresponding peaks) with missing detections across sections. For each feature, we estimated the maximal number of expected detections Df across sections by counting the sections that contained any detection within 0.01 Da from the feature centroid. A feature was considered ‘incomplete’ if the number of sections where it was detected was lower than Df. Variance of the estimator was computed with a jackknife procedure.

Benchmarking tests to compare processing speeds of uMAIA’s peak caller and alignment were performed using real data. Spectra were stacked to test larger dataset sizes.

Zebrafish analyses

Lipid annotation

Annotation of lipids

Peak annotation was achieved by considering the m/z values retrieved from bulk quantitative lipidomic experiments for only sum compositions of lipids. All lipids identified by bulk LC–MS/MS in positive and negative ion mode were considered, and H+, Na+, K+, H–H2O+ and NH4+ adducts were selected for matching to MALDI-MSI data. MALDI compounds were matched to the nearest neighbor (within 0.01 Da) in m/z from possible annotations.

In cases of multiple lipids matching a single m/z value, including isobars and compounds with similar m/z values, the signal was attributed to the most abundant lipid as measured by bulk lipidomics. To disentangle isobaric PC and PE species as well as PC-P/O and PE-P/O their relative abundance was specifically assessed (Supplementary Fig. 11b,c), and the m/z value was assigned to the most abundant lipid. When comparable amounts of isobaric species were found, multiple annotations for the same m/z value were reported. Lipid identity was assessed by querying the SwissLipids database via the METASPACE online platform78. Lipid annotations unconfirmed by SwissLipids (denoted by ’no’ or blank entries in the ’Confirm_lipid_identity’ column of Supplementary Table 1) should be interpreted with caution.

FP annotations were further removed by considering bulk LC–MS/MS data and MALDI-MSI data jointly. Global ad hoc renormalization was implemented by identifying four coefficients to rescale all lipid concentrations from MALDI-MSI data that would maximize the average similarity between datasets. Lipids with quantities that did not result in a positive correlation coefficient were discarded.

Developmental analyses

Quantitative bulk liquid chromatography coupled to tandem mass spectrometry analysis

Data were converted to mol% (measured concentration/total lipid concentration). Averages across replicates for each lipid and time point were calculated. To produce volcano plots, fold change in concentration was calculated between fish at 8 and 72 hpf with P values (Student’s t-test; n = 4). The UMAP representation was colored according to the time at which lipids were maximally abundant after dividing lipid concentration by the average.

Identification of spatially informative molecules across time points

To identify spatially informative lipids, we implemented a statistical test based on Moran’s I spatial autocorrelation index. For each lipid image, we calculated the observed Moran’s I and compared it to a null distribution established through pixel value reshuffling. To account for local baselines, we stratified pixels based on their associated total ion count measurements into eight bins. Next, independently for each lipid, the randomization procedure shuffled pixel values within these strata, maintaining a global notion of underlying tissue structure while disrupting specific lipid patterns. For each lipid, we computed the mean and standard deviation of the null distribution of Moran’s I from the realizations (N = 100) and calculated the z score for the observed Moran’s I, which was converted to a P value. P values were adjusted (Bonferroni correction) to control for multiple comparisons. Lipid images with adjusted P values below 0.01 were classified as showing significant spatial structure.

Four-dimensional metabolic atlas and multivariate analysis

Sagittal sections of zebrafish embryos at 8, 24, 48 and 72 hpf were processed with uMAIA modules. All molecules that had nonzero intensity values in at least 15% of pixels in 80% of sections were kept for downstream analysis.

To construct the volumetric data, consecutive slides were aligned using affine transforms. A 3D array was constructed, concatenating all the images, and smoothed (Gaussian filter; σ = 0.4). Arrays were visualized in napari.

Next, PCA was computed, and the first ten components (explaining over 90% of the dataset’s variance) were considered. The k-means algorithm was applied for unbiased clustering of pixels.

Computationally tracing metabolic pseudolineages

To link lipid territories together across developmental stages, a pairwise squared Euclidean distance matrix between clusters was constructed considering the sequence of time points. Specifically, we start by initializing the distance matrix Dij to a large value (2.5). Next, we consider consecutive time points t and t + 1, and we subset the lipids that were detected at both time points to obtain two matrices of cluster averages \({X}_{km}^{t}\) and \({X}_{lm}^{t+1}\quad \forall m:m\in {\text{feat}}_{t}\ {\rm{and}} \ m\in {\text{feat}}_{t+1}\), and squared Euclidean distances between these pairs were computed and set as values of Dij so that \({D}_{kl}\leftarrow \parallel {X}_{lm}^{t+1}-{X}_{km}^{t}{\parallel }_{2}\). Next, Dij was sorted by optimal leaf ordering. We show clusters in that order and display them to be spaced proportionally to the distance between consecutive clusters. The Hungarian algorithm was applied to match the clusters of each pair of time points t and t + 1.

Atlas at 72 hpf and multivariate analysis

Twenty sagittal zebrafish sections at 72 hpf were processed with uMAIA. Before peak matching, the lowest two percentiles of intensity peaks were removed. After normalization, all molecules that had nonzero intensity values in at least 15% of pixels in 80% of sections were kept for downstream analysis.

Low-dimensional embeddings (nonnegative matrix factorization and diffusion map) were computed using as feature all the peaks annotated as lipids (‘Lipid annotation’). A signal enhancement procedure inspired by that described by Pagoda 2 (ref. 79) was applied, in which each feature is rescaled by a factor proportional to its variance. The adaptation consists of using Moran’s I as a scaling factor instead of the observed–expected variance ratio because typical Poisson mean variance scaling is not expected in MSI data. Specifically, after minimum–maximum normalizing each lipid feature to 0–1, each lipid was multiplied by the median Moran’s I score achieved across sections, and the embedding routines were called on the rescaled matrix. Lipids modules were associated with top lipids sorting the positive loadings of each factor.

Lipid–lipid correlation, enrichment and graph representation

A pairwise correlation distance matrix was calculated across lipids and visualized after column–row sorting using the SPIN algorithm69.

To construct the cluster enrichment heatmap, we applied k means and, for each cluster, averaged the intensity for each lipid, resulting in a matrix with rows (dimension number of regions) and columns (dimension number of lipids). Each column was minimum–maximum normalized to highlight the region in which lipids were most abundant.

To obtain the graph representation of the lipid–lipid similarity at each time point, a cluster enrichment score was calculated for each lipid. Pairwise Euclidean distances between enrichment scores were computed. The matrix of distances was used to compute a force-directed layout using the 500 edges with the smallest distances considered.

Selection of metabolic genes, scRNA-seq and HybISS analysis

To identify zebrafish genes involved in metabolic processes, we used the KEGG database, including genes associated with enzymatic reactions and metabolite transport80. Single-cell transcriptomics data of zebrafish embryos at 72 hpf were obtained from the Zebrahub database57. We selected three subsets of genes: the 400 most variable genes, the 400 most variable metabolic genes and 400 random genes. For each subset, a training set of total cells was used to fit a linear discriminant analysis model to predict cell types. Predictive accuracy was assessed over 100 bootstrap iterations. Among these genes, a set of 50 marker genes and 50 metabolic genes was selected for the HyBISS experiment.

HybISS data analysis: for each section, individual transcripts were binned into 13.6-μm (40 pixels squared) areas, serving as proxies for cells. Genes with fewer than five counts and meta-cells with fewer than five genes were excluded. We used the SCANPY Python package54 to cluster cells with similar transcriptional profiles and Squidpy55 to generate spatial plots of these clusters. Cluster annotation was performed by correlating the mean expression of the HybISS cluster with that of annotated clusters from the scRNA-seq dataset, selecting clusters with the highest correlation scores. The analysis was repeated using only marker genes or metabolic genes.

To compare the information content between marker genes, metabolic genes and lipids, HybISS data were aligned with lipid data using Fiji. We considered all marker gene expression distributions, performed PCA across the feature space and applied Leiden clustering to the top ten principal components (Traag et al.56). Next, we retrieved bootstrap (n = 35) estimates of each set’s (marker genes, metabolic genes, lipid distributions) ability to predict cluster identity: decision trees (scikit-learn implementation) with shallow depths (four) were used as classifiers. To identify the most informative genes for each lipid cluster, a decision tree was fit per cluster using gene expression as independent variables and binarized cluster labels as response variables (1 when belonging to the cluster in question, 0 otherwise). Feature importance was computed from the impurity decrease within each tree of the classifier for each lipid cluster.

uMAIA package

uMAIA is coded in Python, and its source is available at https://github.com/lamanno-epfl/uMAIA.

Relevant packages

Figure

Required packages

Fig. 1

MALDIquant v1.22.1

Fig. 2

gurobipy v9.1.2

Fig. 3

ComBat v0.3.0, scArches v0.5.4, umap-learn v0.4.6, scikit-learn v0.24.2

Fig. 4

napari v0.4.17, umap-learn v0.4.6, scikit-learn v0.24.2, SciPy v1.10.1

Fig. 5

napari v0.4.17, scikit-learn v0.24.2, SciPy v1.10.1, Spotiflow v0.5.0

Fig. 6

NetworkX v2.5

Extended Data Fig. 1

METASPACE (online interface)

Extended Data Fig. 2

MALDIquant v1.22.1, MSiReader v1.0

Extended Data Fig. 3

MSIWarp v0.1, gurobipy v9.1.2

Extended Data Fig. 4

pyro-ppl v1.8.0, pyro-api v0.1.2, ComBat v0.3.0, brainmap v0.2.3

Extended Data Fig. 5

ComBat v0.3.0, scArches v0.5.4, scikit-learn v0.24.2

Extended Data Fig. 6

umap-learn v0.4.6

Extended Data Figs. 79

napari v0.4.17

Extended Data Fig. 10

Spotiflow v0.5.0, umap-learn v0.4.6

Extended Data Figs. 3 and 4

brainmap v0.2.3

Extended Data Figs. 6 and 7

napari v0.4.17, brainmap v0.2.3

  1. v, version.

Statistics and reproducibility

No statistical method was used to predetermine sample sizes. No data were excluded from the analyses. MSI experiments were not randomized and did not have covariates. Morpholino experiments were not randomized, and covariates were controlled by merging different clutches, and random embryo groups were allocated to morpholino or control experiments. A pool of non-injected siblings was kept under the same conditions to control for phenotypic defects unrelated to injections. The investigators were not blinded to allocation during experiments and outcome assessment.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.