Extended Data Fig. 1: Validation of the statistical model robustness.
From: Optimizing multiplexed imaging experimental design through tissue spatial segregation estimation

(a) Distribution of R² values for the saturation model described in Equation (1) across the Visium datasets. The thick line corresponds to the median, and the lower and upper limits of the box correspond to the first and third quartiles, respectively. The lower whisker extends to the lowest value no smaller than the first quartile minus 1.5 times the interquartile range (IQR), and the upper whisker to the highest value no larger than the third quartile plus 1.5 times the IQR. (b) Distribution of R² values for the power-law model described in Equation (2) across the Visium datasets. Whiskers are defined as in (a). (c) Approach used to estimate the impact of clustering granularity on τ. (d) Relationship between τ and the number of clusters for the cerebellum sample. (e) Comparison of the τ parameter value between datasets generated using targeted (n = 7) and untargeted (n = 22) Visium platforms. The p-value was computed using a two-sided Mann-Whitney rank test. Large bars correspond to the median and small bars to the IQR. (f) Comparison of the number of cell phenotypes detected in datasets generated using targeted (n = 7) and untargeted (n = 22) Visium platforms. The p-value was computed using a two-sided Mann-Whitney rank test. Large bars correspond to the median and small bars to the IQR. (g) Comparison of the α (left panel) and C (right panel) parameters between datasets generated using targeted (n = 7) and untargeted (n = 22) Visium platforms. The p-value was computed using a two-sided Mann-Whitney rank test. Large bars correspond to the median and small bars to the IQR. (h) Mean KL divergence between sampled and total cell composition versus the number of sampled regions, for FoVs ranging from 100 to 500 µm, in a Visium breast cancer dataset.
Each point corresponds to the mean number of recovered clusters across 50 similar simulations, and vertical bars correspond to the standard error. The red dashed lines correspond to individual fits for each value of w. (i) Distribution of R² values for the KL sampling model described in Equation (3) across the Visium datasets. The lower whisker extends to the lowest value no smaller than the first quartile minus 1.5 times the interquartile range (IQR), and the upper whisker to the highest value no larger than the third quartile plus 1.5 times the IQR. (j) Comparison of θ values between healthy (n = 7) and tumor (n = 15) samples. The p-value was computed using a two-sided Mann-Whitney rank test. Large bars correspond to the median and small bars to the IQR. (k) Comparison of θ and τ parameter values across the Visium samples. (l), (m) and (n) Relationship between the τ of individual phenotypes and their abundance in three Visium datasets: breast cancer, lymph node and heart, respectively. The dashed black line corresponds to the power-law model described in Equation (4). Each dot corresponds to a specific cell phenotype.
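
The whisker convention used for the box plots in panels (a), (b) and (i) can be illustrated with a minimal Python sketch. It assumes quartiles computed by linear interpolation, since the caption does not state which quartile convention was used. The function name `box_whisker_limits` and the example values are hypothetical and are not data from the study.

```python
from statistics import quantiles


def box_whisker_limits(values):
    """Return (lower_whisker, q1, median, q3, upper_whisker) for a box plot."""
    data = sorted(values)
    # Quartiles by linear interpolation (assumption: the caption does not
    # specify the quartile convention).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations still inside the fences.
    lower_whisker = min(v for v in data if v >= low_fence)
    upper_whisker = max(v for v in data if v <= high_fence)
    return lower_whisker, q1, median, q3, upper_whisker


# Hypothetical R² values, for illustration only (not data from the study).
r2_values = [0.40, 0.90, 0.92, 0.93, 0.95, 0.96, 0.97, 0.98, 0.99]
print(box_whisker_limits(r2_values))
```

In this hypothetical example, the value 0.40 lies below the first quartile minus 1.5 times the IQR, so the lower whisker stops at the next value up and 0.40 would be drawn as an individual outlier.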