Fig. 3: Characterization of technical variability in MSI data: intensity distortions and uMAIA normalization.
From: Unified mass imaging maps the lipidome of vertebrate development

a, Intensity distribution densities of two CHCA (matrix) peaks across sections of the 72-hpf zebrafish dataset. b, Far left: heatmap displaying empirical estimates of foreground mode intensity shifts for 50 molecules (‘Batch effect characterization’ in Methods). The heatmaps on the right show the first-, second- and third-rank approximations of the empirical matrix, obtained from the outer products of the leading one, two or three singular vectors, respectively. c, Overview of the key modeling idea of distribution shifts. The signal distribution is approximated by a bimodal distribution (that is, a mixture of two Gaussians) with a foreground and a background mode. The observed foreground distribution parameters (center and standard deviation) are displaced by an offset, which we consider factorizable into two sources: a molecule-specific source (λ) and an acquisition-specific source (γ). d, Raw MSI images and images normalized with different methods (ComBat, scArches, uMAIA) for representative molecules across landmark sections of the 72-hpf dataset. Color bars span the 1st to 99th percentiles (1st % and 99th %) of intensities across all sections. D, dorsal; A, anterior; L, lateral. e, Intensity distributions of molecules across sections before and after normalization with the different methods. f, Low-dimensional representation (uniform manifold approximation and projection, UMAP) of pixels for raw data and for the outputs of ComBat, scArches and uMAIA normalization. Pixels are color coded by the section from which they originate. g, Spatial visualization of discrete clusters obtained by applying the k-means algorithm to raw data and to data processed with ComBat, scArches and uMAIA. Arrows indicate clear residual batch effects after clustering. The 72-hpf zebrafish data were used.
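
The low-rank approximations in panel b can be illustrated with a truncated singular value decomposition. The following is a minimal sketch, not the authors' code: the function name `rank_k_approximation`, the synthetic matrix `shifts` and its size (50 molecules by a hypothetical 20 sections) are assumptions made for illustration only, and the random data do not reproduce the empirical matrix shown in the figure.

```python
import numpy as np


def rank_k_approximation(matrix: np.ndarray, k: int) -> np.ndarray:
    """Return the rank-k approximation of `matrix` from a truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Scale the first k left singular vectors by their singular values,
    # then multiply by the first k right singular vectors. This equals the
    # sum of the first k scaled outer products u_i * s_i * v_i^T.
    return (u[:, :k] * s[:k]) @ vt[:k, :]


# Synthetic stand-in for the empirical shift matrix (assumption: 50 molecules
# x 20 sections, built as a low-rank signal plus small noise).
rng = np.random.default_rng(0)
shifts = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 20))
shifts += 0.05 * rng.normal(size=shifts.shape)

# Relative Frobenius-norm error of each approximation.
errors = {
    k: np.linalg.norm(shifts - rank_k_approximation(shifts, k))
    / np.linalg.norm(shifts)
    for k in (1, 2, 3)
}
for k, err in errors.items():
    print(f"rank {k}: relative error {err:.3f}")
```

The reconstruction error can only decrease as more singular vectors are included, so comparing the rank-1, rank-2 and rank-3 heatmaps against the empirical matrix indicates how many components are needed to capture most of the structure.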