Extended Data Fig. 1: Overview of datasets and prediction performance.

a, Visualization of cells in the eleven spatial transcriptomics datasets, colored by the expression of the highest-expressed gene in each respective dataset. Abbreviations are as follows: hippocampus (Hipp.), primary visual cortex (VISP), prefrontal cortex (PC), middle temporal gyrus (MTG), somatosensory cortex (SC), gastrulation (Gast.), U-2 OS cell line (U2OS). b,c, Performance of all three gene prediction methods (Harmony, SpaGE, Tangram) on all datasets as measured by (b) gene-wise mean absolute error between predicted and actual gene expression over 10-fold cross-validation, and (c) gene-wise Pearson correlation between predicted and actual gene expression over 10-fold cross-validation. Also shown are the number of cells (n) in each spatial transcriptomics dataset and the number of genes (p) shared between the spatial and RNAseq datasets. In panels b and c, the inner box corresponds to the quartiles of the metric and the whiskers span up to 1.5 times the interquartile range of the metric.
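
For readers who want to reproduce the two metrics in panels b and c, the following is a minimal sketch, not the authors' code. It assumes the held-out predictions from all cross-validation folds have already been assembled into a single cells-by-genes matrix; the function name `genewise_metrics` and the array layout are hypothetical.

```python
import numpy as np

def genewise_metrics(actual, predicted):
    """Return per-gene mean absolute error and Pearson correlation.

    Both inputs are assumed to have shape (n_cells, p_genes), with
    `predicted` holding the held-out predictions for every gene.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    # Panel b: mean absolute error across cells, one value per gene.
    mae = np.mean(np.abs(predicted - actual), axis=0)

    # Panel c: Pearson correlation across cells, one value per gene.
    a = actual - actual.mean(axis=0)
    b = predicted - predicted.mean(axis=0)
    denom = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    with np.errstate(invalid="ignore", divide="ignore"):
        # Genes with zero variance in either matrix get NaN.
        pearson = np.where(denom > 0, (a * b).sum(axis=0) / denom, np.nan)

    return mae, pearson


# Toy example: 3 cells, 2 genes (values are illustrative only).
actual = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 4.0]])
predicted = np.column_stack([actual[:, 0] + 1.0, -actual[:, 1]])
mae, pearson = genewise_metrics(actual, predicted)
```

Each returned array has one entry per gene, which is the distribution a box plot like those in panels b and c would summarize for a given method and dataset.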