VISTA uncovers missing gene expression and spatial-induced information for spatial transcriptomic data analysis

Liu, Tianyu; Lin, Yingxin; Luo, Xiao; Sun, Yizhou; Zhao, Hongyu

doi:10.1038/s42003-025-09479-6

Download PDF

Article
Open access
Published: 08 January 2026

VISTA uncovers missing gene expression and spatial-induced information for spatial transcriptomic data analysis

Communications Biology volume 9, Article number: 203 (2026) Cite this article

2960 Accesses
1 Citations
2 Altmetric
Metrics details

Subjects

Abstract

Understanding cell activities in their spatial context is crucial for revealing spatially influenced cellular states. While single-cell RNA-seq (scRNA-seq) provides comprehensive gene expression profiles, it lacks spatial resolution. Subcellular spatial transcriptomics (SST) captures spatial information but measures only limited genes. To overcome this problem, we present VISTA, a model that predicts unmeasured gene expression in SST data by integrating scRNA-seq and SST through variational inference and geometric deep learning with uncertainty quantification. Across four datasets, VISTA achieves superior imputation accuracy, scalability, and efficiency. Its accurate imputation supports diverse downstream analyses, including identifying spatially variable genes, uncovering ligand-receptor interactions, inferring spatial RNA velocity, performing in-silico perturbations, and disentangling spatial versus intrinsic expression variations. By bridging comprehensive molecular coverage with spatial precision, VISTA enhances the interpretability and utility of SST data, advancing our understanding of tissue organization and cellular microenvironments.

DestVI identifies continuums of cell types in spatial transcriptomics data

Article 21 April 2022

Differential gene expression analysis of spatial transcriptomic experiments using spatial mixed models

Article Open access 14 May 2024

SpaIM: single-cell spatial transcriptomics imputation via style transfer

Article Open access 23 August 2025

Introduction

Single-cell RNA sequencing (scRNA-seq) has matured into a robust technology that allows for the measurement of the complete gene expression profile of individual cells within various tissues^1,2,3. Despite its advancements, scRNA-seq does not account for the spatial positioning of cells, a critical aspect known as spatial information. To bridge this gap, spatial transcriptomic technologies have been developed, enabling the investigation of the cellular spatial context⁴. These technologies fall into two primary categories: (1) Imaging-based methods, including smFISH^5,6, MERFISH⁷, seqFISH⁸ and Xenium^9,10; and (2) Sequencing-based methods, including STARmap¹¹ and Slide-seq¹². Compared to spot-based spatial technologies^4,13, these methods facilitate the capture of gene expression at a single-molecule/subcellular resolution, which can then be aggregated to deduce gene expression levels and cell structure in single-cell resolution. The analysis of single-cell gene expression enriched with spatial information holds great promise for understanding spatial-induced information.

However, the scope of genes sequenced by imaging-based and sequencing-based spatial transcriptomic methods is notably constrained, typically capturing approximately 30 to 10,000 genes^5,7,9,14. This is considerably less than the roughly 20,000 to 30,000 genes profiled by scRNA-seq³. To address this limitation, researchers have developed methodologies for the prediction of unmeasured gene expression within these datasets by aggregating nearest neighbors of cells for regression^15,16,17,18, joint probabilistic modeling^19,20,21, and transport^22,23,24. These methods have been thoroughly benchmarked^10,25. These imputation tools utilize scRNA-seq data from the corresponding tissue to enhance the spatial transcriptomic data, and the results can then be employed in downstream analyses such as the identification of spatially variable (SV) genes²⁶ and the examination of cell-cell interactions (CCIs)²⁷.

Despite these progresses, there are limitations in the existing methods for imputing spatial transcriptomic data in terms of both performance and utility in practice. One primary issue is their subpar imputation quality for large-scale datasets. For instance, datasets sequenced using Xenium, which may contain more than 100,000 cells, yield a Spearman correlation coefficient below 0.5 in validation sets for mouse brain data¹⁰. These methods also lack scalability when applied to large-scale datasets. Tangram, for example, suffers from out-of-memory (OOM) errors when applied to large-scale datasets (smaller than 10,000 cells), and SpaGE exhibits excessively long runtime (more than 40,000 seconds). While gimVI can accommodate large-scale datasets by adjusting the batch size, it does not model spatial information and necessitates a compromise between scalability and running time as the smaller the batch size is, the longer time we need for the same device. Additionally, none of the methods sufficiently account for the discrepancy between reference scRNA-seq data and the target spatial transcriptomic data. Consequently, there is a clear need for more in-depth analysis of the existing methods to overcome the drawbacks in their designs and to develop a more effective approach to the imputation challenge.

At the same time, few methods can estimate the uncertainty of gene expression levels and discuss the effect of prediction errors on downstream applications. TransImp²⁸ estimates the uncertainty based on a post-imputation regression, which requires extra fitting steps. TISSUE²⁹ states that it can produce well-calibrated uncertainty measures for gene expression prediction, but it needs to separate samples to create the calibration set and select genes with high certainty.

To bridge the gap between imputation outcomes and actual gene expression data, we develop a pipeline tailored for the task of imputing spatial transcriptomic data. This process involves a thorough analysis of the reference scRNA-seq and target spatial data pairs, as well as the integration of spatial information during model training. Our proposed solution, named VISTA, leverages a joint probabilistic modeling approach. We combine a Graph Neural Network (GNN)^30,31 architecture and graph sampling strategy³² with Variational Inference (VI)³³ for Spatial Transcriptomic Analysis. VISTA is trained to predict missing gene expression levels with graph sampling for large-scale spatial data, using scRNA-seq data as a reference. By leveraging the contributions of highly correlated genes for training, VISTA utilizes probabilistic modeling for each cell and gene to estimate the uncertainty in our imputation results. Our uncertainty quantification method does not require sacrificing samples in the training dataset to generate calibration datasets while integrating this estimation approach directly into our pipeline. We demonstrate the performance of our proposed method through a series of experiments of imputation by evaluating it from both statistical and biological perspectives. Finally, our case studies of exploring Spatially Variable (SV) genes, Cell-Cell Interactions (CCIs), RNA velocity, and in-silico perturbations suggest that our imputed spatial data can yield more biological insights for complex cellular systems. In summary, the innovation of VISTA is not just incremental integration of GNN and VI, but a principled fusion of these paradigms into a single generative probabilistic graph model that (i) scales to atlas-level datasets, (ii) quantifies the reliability of predictions, and (iii) broadens downstream interpretability. This combination has not been realized by existing spatial-omics integration tools.

Results

Overview of VISTA

The main innovation of VISTA lies in its joint probabilistic modeling framework that integrates spatial topology and expression profiles within a unified variational inference architecture, coupled with built-in uncertainty quantification, a capability missing or only partially implemented in published work. Our goal is to predict the expression levels of missing genes for all the cells from the spatial data. VISTA jointly models the likelihood of reference scRNA-seq data and target spatial data with geometric machine learning. (Fig. 1 (a)). Before fitting our model, we first select reliable genes as anchor genes for these two modalities by thresholding correlation coefficients and significant levels. We assume the gene expression of gene g in cell n from scRNA-seq data x_ng is sampled from a Zero-inflated Negative Binomial (ZINB) distribution p(x_ng∣z_n, s_n, l_n) and spatial data expression ${x}_{ng}^{{\prime} }$ is sampled from a Negative Binomial (NB) distribution $p({x}_{ng}^{{\prime} }| {z}_{n},{s}_{n},{{\mathcal{N}}}_{n})$. Here z_n represents the joint latent space shared by data from these two protocols, s_n indicates the type of protocol (0 for scRNA-seq and 1 for in-situ sequencing). Moreover, for scRNA-seq data, we use l_n to model the sequencing depth, and for spatial data, we use ${{\mathcal{N}}}_{n}$ to model the spatial information by considering the neighbors of cell n. Our target is to approximate the posterior distribution of the latent variables z_n, known as $q({z}_{n},{l}_{n}| {x}_{n},{s}_{n},{{\mathcal{N}}}_{n})$, by using a Joint Variational Auto-encoder (JVAE)³⁴ with Graph Neural Network (GNN). To optimize VISTA, scalable stochastic optimization techniques³⁵ are employed, ensuring efficiency even with large datasets. Furthermore, the model provides an uncertainty measure for the imputed values based on the distribution of the generative model for each cell and gene (Fig. 1b). We sample cells or genes from the distribution and compute the cosine similarity between the sampled data and the estimated mean value for uncertainty estimation. More detailed methodologies of VISTA are described in the Methods section. Our methodology improves both model performances and result interpretation, suggesting the soundness and efficacy of its design.

VISTA can better impute the expression levels of the unobserved genes with uncertainty quantification than other methods for the osmFISH-brain dataset

Utilizing one dataset known as osmFISH-brain^6,36, a relatively small-scale dataset commonly used for benchmarking analysis, we demonstrate VISTA’s efficacy. Figure 2a illustrates the distribution of cell types within the spatial data, highlighting specific patterns such as the clustering tendency of Pyramidal cells, hinting at spatial influences on their distribution.

**Fig. 2: The comparison of imputed results for osmFISH-based data (known as osmFISH-brain).**

Our benchmarking methods for this imputation task include²⁴, gimVI¹⁹,SpaGE¹⁷, TransImp²⁸, ENVI²⁰, SpatialScope²¹, and SpaIM³⁷. Most of the baselines as reported by recent benchmarking studies^10,25. Our metrics also come from a recent benchmarking studies²⁵. SpatialScope failed in all the datasets due to OOM errors in the diffusion model training step, thus we did not include its results in the following sections. VISTA and other baselines were tuned to their best hyper-parameters, ensuring a fair comparison.

As shown in Fig. 2 (b), VISTA outperformed other methods in several metrics, such as Spearman correlation coefficient (SCC), structural similarity index measure (SSIM)³⁸, and Jensen-Shannon divergence (JS), and held a second place in terms of root mean square error (RMSE). These results suggest the better imputation of VISTA. The comparison and subsequent ablation test further revealed that the integration of spatial information during model training significantly enhances imputation quality. This is evidenced by the superior performance of VISTA over gimVI, which is detailed in Supplementary Note 1.

VISTA quantifies uncertainty in two approaches, denoted as VISTA_gene and VISTA_cell, corresponding to the evaluation of genes and cells with certainty levels above the median, respectively. This quantification process is validated in Fig. 2b, which shows an increase in SCC when genes or cells with uncertainty rankings in the top 50% are removed. The filtering process, while potentially increasing variance due to a small testing size, still illustrates the reliability of the produced results. Moreover, we explored the relationship between uncertainty and location or expression levels. Supplementary Fig. 1a shows two genes with low uncertainty, while (b) shows two genes with high uncertainty. We include both the spatial distribution and histogram for these genes. We found that genes with relatively high expression levels and clear spatial patterns tend to have low uncertainty, whereas genes with high uncertainty tend to low expression levels. Therefore, our method assigns higher reliability for genes with stronger signals, which makes sense from the perspective of uncertainty quantification.

The expression patterns of reliable genes, ascertained by VISTA, are shown in Fig. 2c. The top three genes display a strong alignment between the predicted and observed values, both in terms of distribution in the spatial space and magnitude.

In Fig. 2d, we display the contribution of imputation for discovering marker genes of different cell types. The raw marker genes are shown in Supplementary Fig. 2. After imputation, we identified more marker genes and linked these genes to the analysis of spatial transcriptomic data. For example, Kbtbd11 showed strong spatial expression patterns, and cells from Pyramidals also have such patterns. Such discovery can also be made by other baselines such as SpaGE, TransImp and ENVI shown in Supplementary Fig. 3. This consistency shows the role of VISTA in enhancing biological discovery through spatial transcriptomic data refinement. Finally, we include quantitative comparisons for uncertainty estimation in Supplementary Note 1. Again, VISTA performed better than TISSUE in estimating both the cell-level and gene-level uncertainty.

VISTA can better impute datasets from multiple technologies and scales than other methods

In the context of large-scale spatial transcriptomics, the challenge of imputing missing genes is magnified by the substantial increase in dataset size. The Xenium technique is capable of producing datasets with over 160,000 cells, whereas the seqFISH technique which can produce ~60,000 spots. These techniques offer a stark contrast to the smaller-scale datasets like osmFISH, STARmap, and Visium^39,40. This vast increase in scale presents two main challenges: maintaining accurate imputation while ensuring the scalability to handle such large-scale datasets.

To evaluate imputation methods efficiently for these large Xenium-based^10,41 and seqFISH-based datasets⁴², a bootstrap approach was employed to sample 10% of the cells. This sampling strategy allowed for a practical and computationally feasible benchmark without compromising the robustness of the statistical inference. Figures 3a, b shows the spatial distribution of cell types within the Xenium-breast and seqFISH-embryo datasets, revealing distinct spatial patterns among certain cell types, such as Malignant cells and Allanois cells, respectively. This visualization supports the hypothesis that spatial pattern is a property of certain cell types. Figure 3c showcases the benchmarking results for the Xenium-breast dataset, with better performance of VISTA to impute unobserved gene expressions within Xenium-based datasets. VISTA ranked at the top for SCC and consistently had high performance across other metrics, together with the additional benefit of uncertainty quantification. Figure 3d shows that VISTA surpasses other methods in terms of the SCC, SSIM, and JS. Notably, when uncertainty-filtered genes are considered, VISTA demonstrates superior performance across all metrics. For the Xenium-breast dataset, Fig. 3e shows the genes had high certainty and also had very high SCC, by comparing with the ground truth information for the Xenium-breast dataset. KRT7 achieved an SCC of 0.9, and the visualization also shows the consistency between the imputed result and the ground truth information. For the seqFISH-embryo dataset, Fig. 3f shows that VISTA can also impute the spatial enrichment information of reliable genes. For example, for Sox4 and Tcf7l1, part of the spatial enrichment information of these genes was preserved after imputation. We also discussed the performance of VISTA on imputing the unobserved gene expression levels for the Xenium-brain dataset, which is a Xenium-based dataset, shown in Supplementary Note 2.

**Fig. 3: Imputation results comparison for Xenium-based (known as Xenium-breast) and seqFISH-based (known as seqFISH-embryo) datasets.**

To demonstrate the contribution of VISTA in imputing unmeasured genes across different data scales, we also include one ST (Visium)-scRNA-seq pair (denoted as Visium-mouse) from mouse muscle tissue^39,40. This dataset has a smaller number of spots (982) compared with other datasets. Supplementary Fig. 4 shows that VISTA with the uncertainty measurement can still outperform selected baselines in three out of four metrics, demonstrating our flexibility in handling data with different scales. Moreover, to summarize our superiority, we included heatmaps marked with statistical test significance between VISTA_gene versus all other methods, shown in Supplementary Fig. 5. This figure shows that VISTA can produce results with significant improvement versus most of the selected baselines across various datasets and metrics. Therefore, VISTA, as a method empowered by uncertainty quantification, could be a better candidate for the imputation task.

We also test the effect of different embedding-fusion methods, including addition (Addition), mean (Mean), and concatenation (Concat), and visualize the performance difference in Supplementary Fig. 6. According to this figure, addition is a generally better approach. To compare the performance of different methods across all the datasets (the osmFISH-brain dataset, the Xenium-breast dataset, the Xenium-brain dataset, and the seqFISH-embryo dataset) comprehensively, we averaged the rank of different methods in each dataset and summarized them in Supplementary Fig. 7a. VISTA outperformed other methods in three out of four datasets by considering the average rank, which implied the strength for imputing missing genes.

The analysis of uncertainty estimation emphasizes the collection of paired sequencing data

Since the improvement of the uncertainty quantification for the seqFISH-embryo dataset and the Xenium-brain dataset was not clear, we delved into the conditions under which uncertainty quantification can work well as the explainability of VISTA. We hypothesized that the reliability of the imputation results is related to the similarity between the reference scRNA-seq and the target spatial data, with results shown in Supplementary Fig. 8a–c. We used the pre-computed correlations of each shared gene between the reference dataset and the target dataset, plotted the relation between correlations and uncertainty scores, and then computed the SCC based on these two metrics. For the Xenium-breast dataset, we found that the SCC is negative and significant, which supports the rationality of our uncertainty quantification method. However, for the results based on the rest two datasets, the correlation was not at a similar significant level (p-value = 0.07 for the Xenium-brain dataset and p-value=0.16 for the seqFISH-embryo dataset). Therefore, we advocated the collection of this type of data for analyses.

The robustness of uncertainty estimation and neighbor searching

Since our uncertainty estimation methods require the computation of vector similarity as well as the sampling of cells, it is important to demonstrate the robustness and the rationale of the current solution with sensitivity analysis. We first considered replacing the cosine similarity (CS) with Euclidean distance (ED) and selected genes with lower ED as genes with higher certainty, and the comparison is shown in Supplementary Fig. 9a. According to this figure, ED did not perform well in selecting genes with high imputation reliability, which suggests the limitation of using ED to compute the distance of high-dimensional data. Furthermore, we also considered alternating the number of sampled cells (L) and the result is shown in Supplementary Fig. 9b. Based on this figure, the results are not sensitive to the choice of L, and thus our uncertainty estimation approach is robust with respect to the tuning.

We also considered adjusting the solutions of neighbor searching based on spatial locations in FAISS⁴³, including L2 distance (L2), Inner Product (IP), and Locality Sensitive Hashing (LSH), and the results are shown in Supplementary Fig. 9c. Based on this figure, we also found that using different strategies for neighbor searching did not change our results apparently, and thus our method is also robust to different neighbor searching approaches.

We finally explored the effect of approaches to compute the uncertainty metric with different statistics, including variance, median, and the combination of these statistics, shown in Supplementary Fig. 10, which shows that the contribution of the variance-based approach is not as good as filtering genes based on the median value evaluated based on all four metrics.

VISTA facilitates the discovery of CCIs and SV genes

The process of imputing missing genes in spatial transcriptomics is a critical step for enhancing the resolution of biological functions at the molecular level. One primary goal of this imputation is to enable the identification of SV genes and the discovery of ligand-receptor pairs that are essential for CCI studies, which are central to understanding the complex communications in tissue microenvironments. Before deploying the imputation models on the entire Xenium-based datasets, a preliminary assessment of the scalability of these models was essential to ensure their feasibility in processing large-scale data without computational constraints. This scalability check is needed to identify any models that may not have been initially designed to handle datasets of such scale, as indicated by the reference to Fig. 4a. We found that some models including TransImp and gimVI encountered OOM errors when dealing with the extensive Xenium-based datasets (with gimVI failing specifically on the Xenium-brain dataset), VISTA retained its robust performance. Notably, VISTA was not only capable of processing all three datasets but also maintained a relatively stable running time, even as the scale of the datasets increased. This suggests better scalability of VISTA, making it a desirable model for large-scale spatial transcriptomic data. Such efficiency is important, where the volume of data is rapidly growing, and the computational demands are continually escalating.

Fig. 4: Evaluation of different methods based on scalability and biological function discovery (sample size (n) = 10 for each group; center point, median; box limits, upper and lower quartiles; whiskers, up to 1.5 × interquartile range; points, outliers).

Firstly, we considered evaluating the clustering performance of imputation results. We computed the Normalized Mutual Information (NMI), Adjusted Rand Index (ARI) and Average Silhouette Width (ASW) scores⁴⁴ based on the clustering after imputation and the original cell-type labels. According to Fig. 4b for the osmFISH-brain dataset, VISTA achieved the best NMI score and its ARI and ASW scores were comparable to the best model. Based on Fig. 4c for the Xenium-breast dataset, VISTA achieved the best scores for three metrics, and surpassed Tangram and SpaGE. According to Fig. 4d for the seqFISH-embryo dataset, VISTA still performed well in the comparison based on the NMI and ARI scores. These results suggest that VISTA is not only robust in imputing missing data but also excels in preserving the biological information of the raw data. The original cell-type-specific information appears to be well-retained after imputation, as reflected in the high scores across different metrics. This fidelity in clustering post-imputation is crucial, as it ensures that subsequent biological interpretations and analyses remain grounded in the cellular heterogeneity before imputation.

Secondly, we considered evaluating the ability of imputation results on SV gene identification. Here, we used both Moran’s I score⁴⁵ and SpatialDE (as well as SpatialDE2)^26,46 scores to evaluate the correlation between gene expression and spatial location. These two methods have less sensitivity to induced sparsity and SpatialDE has a low false-positive rate⁴⁷, and other comparable methods including SPARK-X⁴⁸ had OOM errors in analyzing large-scale imputed spatial data. We described these two scores in the Methods section. According to Fig. 4b–d, all methods except Tangram and SpaGE performed well. SpaGE did not find genes whose expression levels were related to spatial location because Moran’s I score was 0. Moreover, VISTA achieved good and stable scores across the three datasets for both Moran’s I score and SpatialDE score. We also compared the imputed gene expression levels of SV genes and Highly Variable (HV) genes. Figure 4e shows that the imputed SV genes discovered by SpatialDE had similar expression patterns with the HV gene identified in the paired scRNA-seq dataset. Due to the running time limit, we recorded the SV gene identified by Moran’s I test rather than SpatialDE and HV gene from paired scRNA-seq dataset for the Xenium-breast dataset, shown in Fig. 4f, with similar conclusions. Therefore, VISTA’s results incorporate gene variance from paired scRNA-seq data and exhibit similar patterns spatially so we may gain more insights to analyze the similarity among multi-omic data. Overall, these evaluations demonstrate the effectiveness of VISTA in preserving and revealing the spatial heterogeneity of gene expression, which is crucial for understanding the spatial biology of tissues. The stable performance of VISTA across different datasets and metrics demonstrates its utility in the analysis of spatial transcriptomics.

Finally, we evaluated the performance of different methods in discovering ligand-receptor pairs. Based on Fig. 4b, the performance of VISTA was comparable to gimVI and both had promising scores. Based on Fig. 4c, VISTA ranked 2nd in the comparison. Based on Fig. 4d, VISTA outperformed Tangram, SpaGE, and TransImp for the seqFISH-embryo dataset. Moreover, Tangram did not perform well across all three datasets, suggesting its limited performance in identifying ligand-receptor pairs. To explore the ligand-receptor pairs discovered in the imputation results by VISTA qualitatively, we display the bubble heatmaps for the Xenium-breast dataset in Fig. 4 (g). These identified pairs may piece together complex cellular communication, and lead to a deeper understanding of the functional organization of tissues and potentially highlight targets for therapeutic intervention. For example, we discovered specific highly-expressed pairs KDR-PECAM1^49,50, KDR-CDH5⁵¹, and CD34-SELL⁵² between immune cells and malignant tumor cells, thus these genes play an important role for the control of cancers. Overall, the evaluation of VISTA in the context of ligand-receptor pair discovery underscores its potential as a tool for elucidating cell-cell interactions within various biological systems. We also explored the difference between treating the imputed data as scRNA-seq data and treating the imputed data as spatial transcriptomic data, with the results summarized in Supplementary Note 3.

Supplementary Fig. 7b shows the average rank of metrics we used in this section. VISTA still outperformed the rest of the methods in three out of four datasets. Therefore, the results imputed by VISTA can contribute to various downstream applications.

VISTA facilitates RNA velocity analysis and signaling direction inference by imputing dynamic properties of genes

Here, we explored spatial RNA velocity inference based on spatial transcriptomic data^53,54. By replacing the embedding space with spatial location, we can uncover the relation between RNA velocity and spatial distribution by imputing unobserved spliced gene expression and unspliced gene expression. In this section, we analyzed another Xenium-brain dataset sequenced by HybISS⁵⁵.

We examined the capacity of VISTA to deduce signaling directions within cell-cell communication, a critical aspect of understanding interactions across various cell states. We employed COMMOT⁵⁶ to calculate these signaling directions. Figure 5a contrasts the inferred signaling directions from spatial data, both pre- and post-imputation. The left panel of the figure indicates that prior to imputation, the signaling directions computed by COMMOT lacked significance. This was exemplified by the FGF signaling direction, which did not exhibit any discernible pattern in this figure. Post-imputation results, however, were markedly more substantial. The center panel reveals a clear signaling direction for CXCL, as discerned from the imputed data. Similarly, the right panel highlights the ANGPTL signaling direction, bolstering the same inference. Thus, the ability to impute missing gene expressions significantly enhances the detection and analysis of signaling directions, yielding more pronounced and interpretable results.

**Fig. 5: Analysis of signal direction and spatial variation.**

Referencing Fig. 6a, we conducted a comparative analysis of post-imputation results between TransImp and VISTA, using the RNA data from the scRNA-seq dataset as ground truth. Given that our imputation was derived from the scRNA-seq dataset, we anticipated that the spliced and unspliced RNA profiles in our spatial data would mirror those in the reference dataset. VISTA adeptly conserved the splicing information, whereas TransImp disrupted the expected proportional relationship between spliced and unspliced RNA among different cell types, and inverted the accurate proportion on a global scale. This comparative error analysis is illustrated in Fig. 6b, suggesting that the imputation of TransImp is unsuitable for inferring RNA velocity. This is further substantiated by the analyses presented on the left of Fig. 6c, where RNA velocity was computed for both imputation methods using scVelo⁵⁷. The red circle highlights the outcomes from TransImp, where an incongruous convergence of RNA velocity in mature neuronal cells is evident. In contrast, the RNA velocity inferred from VISTA aligns more closely with biological expectations, where mature neural cells are developed from immature ones and there is no clear convergence of RNA velocity in mature cells. Additionally, by employing VeloVI⁵⁸, we reassessed RNA velocity to ascertain the level of uncertainty, as depicted in Fig. 6d. Notably, most cells demonstrated low intrinsic uncertainty versus higher extrinsic uncertainty, indicating that while the magnitude of RNA velocity inferred from imputation is stable, its directional consistency is not.

**Fig. 6: Analysis of spatial RNA velocity inference.**

VISTA facilitates knowledge transfer from single-cell scope to spatial scope

In this section, we extend the capabilities of VISTA to include the transfer of perturbation effects from scRNA-seq to spatial datasets, and thus we could simulate spatial perturbation datasets with different conditions. To simulate the perturbation case, we decoded the gene expression values using latent space from spatial data and decoder part for scRNA-seq data. To evaluate simulation results, we selected real datasets with targeted perturbation effect and computed the correlation between simulated expression profiles and real expression profiles. Firstly, we explored the ability of VISTA to transfer the gene expression levels with diseased information as the special case of perturbation from reference scRNA-seq⁵⁹ to target spatial data⁶⁰. We treated the scRNA-seq dataset and the spatial dataset as paired datasets. In total, we have four controlled datasets and four diseased datasets with different ages, thus we separated them into two groups. Here we considered two controlled datasets (one from an 8-month-old mouse, the other from a 13-month-old mouse) and two datasets with Alzheimer’s disease (AD) (one from an 8-month-old mouse, the other from a 13-month-old mouse). We utilized the diseased scRNA-seq dataset as a reference dataset and trained VISTA to transfer the spatial data with the controlled case into the diseased case.

Supplementary Figs. 12 and 13 show the results for the 8-month dataset and 13-month dataset, respectively. Supplementary Figs. 12a and 13a show the cell-type distribution in the spatial space for AD datasets and control datasets, and we found obvious cell-type overlap between paired datasets. To investigate the correctness of the transferred information based on VISTA, we visualized the overlapping top 10 DEGs from paired datasets with two cases for 8m and 13m in Supplementary Figs. 12b and 13b. For the overlapped DEGs, the generated AD dataset showed similar patterns compared with the real AD dataset. To perform quantitative comparisons, we computed the differences in gene expression (Δexp) between the spatial dataset with the AD condition and the spatial dataset with the control condition based on Microglia (Micro) cells. Supplementary Fig. 12c shows the Δexp calculated based on the imputed 8m AD dataset and the observed 8m AD dataset. The SCC between these two arrays of Δexp was 0.41 (p-value=1.1e−104). The observation is similar based on Supplementary Fig. 13c for the 13m AD dataset. Therefore, VISTA can successfully simulate the spatial dataset with the AD case. For the other group, we had similar conclusions from Supplementary Fig. 14 and Supplementary Fig. 15.

Moreover, we also considered large-scale spatial data ( ~800,000 cells)⁶¹ sequenced from lung tissue with cancer using MERFISH, as well as large-scale scRNA-seq data ( ~200,000 cells)⁶² with diseased information. Supplementary Fig. 16a shows the cell-type distribution of the spatial data we used, with clear spatial patterns for Epithelial cells and Fibroblasts cells. After the imputation by VISTA, the specific clustering patterns of these two cells were also preserved, shown in Supplementary Fig. 16b. Therefore, using the gene expression information after imputation, we could identify more spatial-induced genes for cells from lung cancer samples. The DEG identification after imputation is shown in Supplementary Fig. 16c. This discovery might contribute to the research of in-silico treatment⁶³ for lung cancer.

Utilizing a perturbed mouse brain dataset derived from perturb-seq⁶⁴ as the scRNA-seq reference, we further assessed the fidelity of transferring perturbation signatures to the previously utilized osmFISH-brain spatial dataset. As depicted in Supplementary Fig. 17a, the low expression levels of the perturbed gene Ank2 demonstrate the retention of perturbation effects through our transfer process. Additionally, Supplementary Fig. 17b indicates a reduction in the disparity of cell-type distributions post-imputation, suggesting a spatial manifestation of gene perturbation effects. To quantitatively analyze the results we obtained, we selected the genes from the IN1 module, which is affected by the perturbation of gene Ank2⁶⁴, and computed the correlation between the gene-gene correlation coefficients of the imputed spatial dataset and reference scRNA-seq dataset. In Supplementary Fig. 17c, we visualize the correlation with their SCCs. We showed that our imputation process may transfer the perturbed information from reference scRNA-seq data into target spatial data. Further analysis of expression changes in genes common to both datasets, illustrated in Supplementary Fig. 17c, d, confirms the impact of perturbations on gene expression patterns.

VISTA imputes spatial transcriptomic data for spatial RNA velocity inference

For scRNA-seq datasets with spliced and unspliced RNA information, we could simulate the spatial transcriptomic data with perturbation by using dynamo⁶⁵. We imputed the corresponding spatial data with both spliced gene expression and unspliced gene expression. We then computed the in-silico perturbation using the RNA velocity from spatial data. We illustrate the dynamic relation of different neural cell types in Supplementary Fig. 18a, and the dynamic relation corresponded to the differential stages of cells^55,66,67,68 showed strong signals. Therefore, we had biological support for the in-silico perturbation inference. We show the perturbation results by suppressing Gata1 in Supplementary Fig. 18b and Spi1 in Supplementary Fig. 18c. From these figures, we found that the trajectory was changed in this process and detected fewer trajectory patterns. The activation of these two genes plays an important role in the development of the nervous system^69,70, suggesting the potential utility of our computation results. Since there are no available datasets for spatial transcriptomic sequencing with perturbation⁷¹ as far as we know, our experiments posed two approaches to simulate spatial data under different perturbations for possible downstream analysis⁷².

Discussion

Spatially resolved transcriptomics at the single-cell level afford the exploration of cell types and gene patterns across various spatial scales. Yet, the breadth of such analysis was previously constrained by the number of analyzable genes and the cell capture capacity. With the advent of Xenium, we can concurrently sequence numerous cells, circumventing the cell limitation issue.

Existing models provide useful baselines but face major scalability and efficiency limitations on large spatial datasets. SpatialScope encounters memory issues, while gimVI cannot integrate spatial context and suffers from OOM errors on Xenium data due to inefficient architecture⁷³. Tangram’s full-batch mode causes OOM, and its mini-batch variant optimizes locally and relies on temporary file storage⁷⁴. SpaGE’s linear regression in PCA space fails to capture complex spatial patterns and depends on CPU computation, reducing speed. TransImp improves imputation accuracy but cannot scale or quantify uncertainty. In the Xenium era, the key challenges are accurate imputation and high computational efficiency.

VISTA is a robust solution capable of overcoming the challenges that its predecessors face. Its unique design allows us to handle both small-scale and large-scale spatial data, offering superior performance in various scenarios, including those that require uncertainty quantification. The scalability of VISTA is enhanced by its compatibility with GPU acceleration and its efficient mini-batch training approach, making it a viable option for a wide range of academic research settings due to its reasonable run times on contemporary hardware. The generative model at the core of VISTA supports uncertainty quantification through sampling techniques.

Our investigation extended to various downstream applications utilizing imputed results. Firstly, we identified marker genes and DEGs after imputation for spatial data with cell-type annotation. Secondly, we discovered genes with spatially expressed patterns by finding SV genes. We also discovered ligand-receptor pairs after imputation. Thirdly, we demonstrated the power of VISTA for imputing spliced genes and unspliced genes for spatial data and used the imputation results to infer spatial RNA velocity. Moreover, we also analyzed the signal direction of some pathways to figure out significant patterns related to cell-cell communication. We also discussed the influence of imputation on variation decomposition for spatial data and discovered genes with more significant associations with spatial variation. Finally, we explored possible knowledge transfer from the scRNA-seq domain to the spatial domain. Therefore, the imputation through VISTA offers biological discoveries in different dimensions.

Our flexible input feature design enables the imputation of additional modalities, such as Antibody-derived Tags (ADTs)⁷⁵ and chromatin accessibility data from single-cell ATAC sequencing (scATAC-seq)⁷⁶. The versatility of VISTA is further illustrated by its applicability to different datasets of similar tissues, a process facilitated by transfer learning⁷⁷. Although we have demonstrated the superiority of VISTA with various experiments, there is room for further improvement. For example, the training of VISTA based on atlas-level data is not stable and requires methods such as gradient clipping to reduce large gradients, and we also did not fully discuss the criteria to select a good reference scRNA-seq dataset for spatial transcriptomics data with different resolutions. In the future, we will consider developing a more unified tool to perform imputation.

Methods

Problem definition

Considering we have gene expression profiles X^N,G and $X^{\prime} {N}^{\prime} ,{G}{^{\prime}}$ to represent the data from scRNA-seq protocols and the data from spatial transcriptomic protocols, where these two profiles came from the same tissue and ${G}^{{\prime} }$ is a subset of G. Our target is to learn a model ${\mathcal{M}}$, which can impute the unobserved genes in spatial transcriptomic data as $X{^{\prime} }{{N}{^{\prime} },{G}{^{\prime} }}={\mathcal{M}}(X^{\prime} )$. We are also interested in estimating the corresponding uncertainty levels for each cell and each gene. The related downstream applications after imputation are explained later in the Methods section.

Anchor genes pre-processing

Since we observed that not all the shared genes from scRNA-seq data and SST data are strongly correlated, and some may even have a negative PCC based on Supplementary Figs. 19 (a)-(c), a filtering process is needed for both performance and reliability improvement. Here we consider two vectors representing shared cell-type labels as c^M,1 from these two datasets. M represents the number of shared cell-type labels. We average cells from the same cell types by measurement protocols to generate the pseudo-bulk samples X^M,G and $X{{\prime} }^{M,{G}{^{\prime} }}$. By assuming ${G}^{{\prime} }\subset G$, we take the common genes and compute the PCC for each gene $g\in {G}^{{\prime} }$, to extract the correlation and significance for each gene:

$$\forall {{g}^{\prime}} \in {{\mathcal{G}}}^{\prime} ; r,p=\,{\rm{Pearson}}\; {\rm{Correlation}} \,({{X}^{M,g}}, {{X}^{{\prime} {M,g}}}).$$

(1)

The output variable r represents correlation and p represents p value. We then filter genes with r ≤ 0.5 or p ≥ 0.05 by our default design. The remaining genes are treated as anchor genes and used for constructing the updated ${G}^{{\prime} }$. We also tested the performances of Spearman correlation but we did not recommend it because the Spearman correlation coefficients of genes are almost all non-significant (p-value > 0.05), which substantially decreased the size of training dataset and introduced false-negative bias. If users do not intend to include cell types for both reference scRNA-seq data and SST data, they can skip this step.

Generative model of VISTA

The design of this section is inspired by Lopez et al.¹⁹ as a Variational Auto-encoder (VAE)⁷⁸. We assume the following generative process to model the common components of gene expression of scRNA-seq as x_mg and the gene expression of spatial transcriptomic data ${x}_{n{g}^{{\prime} }}^{{\prime} }$:

$${z}_{n} \sim {\mathcal{N}}\left(0,{I}_{d}\right), \hfill\\ {\rho }_{n} = {f}^{\eta }({z}_{n},{s}_{n}),\\ {\ell }_{n} \sim \log {\rm{Normal}}\,\left({\mu }_{n},{\sigma }_{n}^{2}\right).$$

(2)

Here we use z_n to represent the shared biology component between the two protocols, and I_d represents an identity matrix with size (d, d). For each cell n, we have a binary variable as s_n to specify whether the cell is captured by scRNA-seq or spatial experimental protocols. Let G donate the set of genes captured by scRNA-seq, and ${G}^{{\prime} }$ donate the set of genes captured by spatial transcriptomic data. We assume ${G}^{{\prime} }\subset G$. We denote a neural network f^η with parameters η to model the probability simplex by generating a variable ρ_n. Such variables can be interpreted as the normalized frequencies of each gene g of one cell. ℓ_n is used to model the log-library size of scRNA-seq data with mean μ and variance ${\sigma }_{n}^{2}$. Correspondingly, we have ${\ell }_{n}^{{\prime} }$ to model the log-library size of spatial transcriptomic data. Since the technical variation of spatial data is not large, ${\ell }_{n}^{{\prime} }$ is not a random variable but the number of original gene transcripts in a given cell.

To model the gene expression from scRNA-seq measurement, we treat the observed gene expression for each cell is conditioned on variable set {ℓ_n, z_n, s_n} and the gene expression can be sampled based on overdispersed count data distributions:

$${x}_{ng} \sim \left\{\begin{array}{l}{\rm{ZINB}}\left({\ell }_{n}{\rho }_{ng},{\theta }_{g},{f}_{g}^{\nu }\left({z}_{n},{s}_{n}\right)\right),\\ \,\text{or}\\ {\rm{NB}}\left({\ell }_{n}{\rho }_{ng},{\theta }_{g}\right),\hfill\end{array}\right.$$

(3)

where ZINB represents zero-inflated negative binomial distribution, and NB represents negative binomial distribution. θ_g represents a vector of gene-specific inverse dispersion parameters. Such options are provided by gimVI and we model the gene expression data by ZINB in this manuscript.

To model the gene expression from spatial experimental protocols, we first re-normalize the expressed gene frequencies based on:

$$\forall {g}^{\prime} \in {{\mathcal{G}}}^{\prime} ,{\rho }^{\prime}_{n{g}^{\prime} } =\frac{{\rho }_{n{g}^{\prime} }}{{\sum }_{g\in {\mathcal{G}}}{\rho }_{ng}},$$

(4)

because we subset the genes from ${G}^{{\prime} }$. Therefore, we can model the gene expression ${x}_{n{g}^{{\prime} }}^{{\prime} }$ also based on $\{{\ell }_{n}^{{\prime} },{z}_{n},{s}_{n}\}$ as

$${x}_{n{g}^{{\prime} }}^{{\prime} } \sim {\rm{NB}}\left({\ell }_{n}^{{\prime} }{\rho }_{n{g}^{{\prime} }}^{{\prime} },{\theta }_{{g}^{{\prime} }}^{{\prime} }\right),$$

(5)

where θ_g represents a vector of gene-specific inverse dispersion parameters. Such design follows the modeling process of gimVI.

Posterior inference and graph neural network construction of VISTA

The design of this section is inspired by¹⁹, and we utilize a Bayesian inference approach conditioned on spot neighbors to optimize the parameters in the given model. During the training process, to handle the large-scale datasets, we implement an approach based on node sampling to generate neighborhood graphs for each cell in the given batch. For a batch with batch size n, we sample n cells from the total spatial data and compute the neighbor graph based on these cells. That is,

$${c}_{n} = NodeSample(\{{c}_{N}\}),\\ {e}_{n} = ExtractGraph({c}_{n},{{\mathcal{N}}}_{n}),$$

(6)

where ${{\mathcal{N}}}_{n}$ represents the neighbor sets of all the cells in c_n. We implement the edge generation process based on FAISS⁴³ for acceleration.

For the data from two different protocols, we can decompose the objective function as

$$L=\mathop{\sum }_{\begin{array}{c}n=1\\ {s}_{n}=0\end{array}}^{N}\log {p}_{\Theta }\left({x}_{n}| {s}_{n}\right)+\mathop{\sum }_{\begin{array}{c}n=1\\ {s}_{n}=1\end{array}}^{N}\log {p}_{\Theta }\left({x}_{n}^{{\prime} }| {e}_{n},{s}_{n},{\ell }_{n}^{{\prime} }\right),$$

(7)

where N represents the total number of samples, and p_Θ represents the distribution under different conditions. The parameters we intend to optimize in the generative model are represented as $\Theta =\{\eta ,\nu ,\theta ,\theta {\prime} \}$. Since it is hard to perform exact posterior inference here, we utilize variational inference to access the posterior distribution.

Considering we have two distributions ${q}_{{\phi }_{1}}$ based on scRNA-seq data and ${q}_{{\phi }_{2}}$ based on spatial data. We can first encode the cell-level embeddings z based on a non-linear MLP ${F}_{{\theta }_{C}}(x)$, and thus we have

$$z={F}_{{\theta }_{C}}(x).$$

(8)

Furthermore, we intend to learn the joint latent space for these two modalities and for data from spatial transcriptomics, so we also connect a Graph Attention Network (GAT)⁷⁹ to capture the neighbor information for spatial transcriptomic data. Considering i and j are two indices of neighbor spots in the training batch, the computation can be represented as

$${\widehat{h}}_{i} = W{F}_{{\theta }_{S}}({x}^{{\prime} }),\\ {e}_{ij} = {\rm{LeakyReLU}}\left({{\bf{a}}}^{T}[\,{\widehat{h}}_{i}\,\parallel \,{\widehat{h}}_{j}]\right),\\ {\alpha }_{ij} = \frac{\exp ({e}_{ij})}{{\sum }_{k\in {\mathcal{N}}(i)\cup \{i\}}\exp ({e}_{ik})},\\ {z}_{i}^{{\prime} } = {\rm{ReLU}}\left({\sum }_{j\in {\mathcal{N}}(i)\cup \{i\}}{\alpha }_{ij}\,{\widehat{h}}_{j}\right),$$

(9)

where ${F}_{{\theta }_{S}}$ is a non-linear MLP to learn the initial embeddings, W and a are learnable parameters, LeakyReLU, ReLU are the default activation functions for GAT, exp is the function of exponential computing, and ${\mathcal{N}}(i)$ represents the set of i_th spot’s neighbors. The final updated ${z}^{{\prime} }$ represents the spot-level embeddings with spatial context.

Therefore, to formalize the joint distribution, we have

$${q}_{{\phi }_{1}}(z,\ell | x,s) = {q}_{{\phi }_{1}}(z| x,s){q}_{{\phi }_{1}}(\ell | x,s),\\ {q}_{{\phi }_{2}}({z}_{s},{\ell }^{{\prime} }| {x}^{{\prime} },s) = {q}_{{\phi }_{2}}({z}_{s}| {x}^{{\prime} },s),{z}_{s}=z+{z}^{{\prime} },$$

(10)

where these distributions are Gaussian with diagonal covariance matrices, and we can utilize a neural network to generate the parameters of these two distributions, which has been discussed above. Here e represents the edges of the neighbor graphs of samples in z. This setting can also be treated as a version of the mixture-of-expert (MoE) design with one expert focusing on scRNA-seq and the other expert focusing on spatial transcriptomic data⁸⁰. The evidence lowerbound (ELBO) of these two modalities are

$$\log {p}_{\Theta }(x| s) \ge {{\mathbb{E}}}_{{q}_{{\phi }_{1}}(z,\ell | x,s)}\log {p}_{\Theta }(x| z,s,\ell )\\ -{D}_{KL}\left({q}_{{\phi }_{1}}(\ell | x,s)\parallel p(\ell )\right)\\ -{D}_{KL}\left({q}_{{\phi }_{1}}(z| x,s)\parallel p(z)\right),\\ \log {p}_{\Theta }\left({x}^{{\prime} }| e,{\ell }^{{\prime} },s\right) \ge {{\mathbb{E}}}_{{q}_{{\phi }_{2}}\left({z}_{s}| {x}^{{\prime} },s\right)}\log {p}_{\Theta }\left({x}^{{\prime} }| {z}_{s},s,{\ell }^{{\prime} }\right)\\ -{D}_{KL}\left({q}_{{\phi }_{2}}({z}_{s}| {x}^{{\prime} },s)\parallel p({z}_{s})\right).$$

(11)

Here e still represents edges. We then optimize the neural networks to obtain the best model, and the aggregation between the latent space as input and the output of GAT is inspired by residual learning⁸¹. From the domain adaption theory given by Blitzer et al.⁸², we also refer to the implementation of gimVI and add an adversarial classifier for reducing the distribution shift between spatial data and scRNA-seq data. Reliably imputed genes should be either well fitted by VISTA or well fused in the joint latent space.

We show the contribution of MLP and GNN in Supplementary Figs. 20a, b. Supplementary Fig. 20a demonstrates that the embeddings generated by MLP can capture more cell-type-specific information, while cell embeddings generated by GNN aggregate spatial information and show strong clustering signals for certain cell types. Moreover, Supplementary Fig. 20b demonstrates that introducing information disentanglement for expression signals and spatial signals do not help on learning a better representation, which is possibly caused by cellular heterogeneity.

For the missing gene imputation step, we consider a gene ${g}_{i}\in G-{G}^{{\prime} }$, which is missing in the spatial measurement. We sample the variable z based on the posterior distribution ${p}_{\Theta }(z| {x}^{{\prime} },s=1)$ and we use our neural networks to compute the missing gene expression ${x}_{i}^{{\prime} }(z)$ for all the cells in the spatial domain. The mixture model of VISTA is shown in Supplementary Fig. 21.

Uncertainty estimation

We offer the uncertainty estimation at both the cell level and the gene level. Both estimation processes are based on the distribution of the generative model: ${x}_{n{g}^{{\prime} }}^{{\prime} } \sim {\rm{NB}}({\ell }_{n}^{{\prime} }{\rho }_{n{g}^{{\prime} }}^{{\prime} },{\theta }_{{g}^{{\prime} }}^{{\prime} })$ and we denote ${\ell }_{n}^{{\prime} }{\rho }_{n{g}^{{\prime} }}^{{\prime} }={\bar{v}}_{ng}$, where ${\bar{v}}_{ng}$ represents the mean of our NB distribution. The estimation process is inspired by VeloVI⁵⁸.

For the gene-level uncertainty estimation, we sample L examples for all the cells and gene g. Therefore, the uncertainty for this gene, known as δ_g, can be defined as follows:

$${\delta }_{g}=1-{\rm{median}}\left({\left\{\frac{{v}_{jg}\cdot {\bar{v}}_{g}}{\left\Vert {v}_{jg}\right\Vert \left\Vert {\bar{v}}_{g}\right\Vert }\right\}}_{j=1}^{L}\right).$$

(12)

Here v_jg represents the jth sample we generate from our output distribution, and we compute the median value of the cosine similarity between the sampled value and mean value. We set $L=\min (\frac{{N}^{{\prime} }}{10},100)$ if we have ${N}^{{\prime} }$ cells from spatial data. We use gene-level uncertainty to determine which genes we can trust after imputation, and we select genes whose uncertainty is lower than the median value for comparison in our experiments.

Similarly, for the cell-level uncertainty estimation, we sample L cells for cell c and all genes. Therefore, the uncertainty of this cell, known as δ_c, can be defined as follows:

$${\delta }_{c}=1-{\rm{median}}\left({\left\{\frac{{v}_{c}^{j}\cdot {\bar{v}}_{c}}{\left\Vert {v}_{c}^{j}\right\Vert \left\Vert {\bar{v}}_{c}\right\Vert }\right\}}_{j=1}^{L}\right).$$

(13)

We also set $L=\min (\frac{{N}^{{\prime} }}{10},100)$ if we have ${N}^{{\prime} }$ cells in spatial data.

Experiment design

In the benchmarking analysis of imputation accuracy for the four datasets, we split the overlap genes between scRNA-seq datasets and spatial datasets into training datasets (approximately 70% of the genes; among these genes, 10% genes are used for validating) and testing datasets (approximately 30% of the genes)⁸³. Different methods have their specific design for splitting training datasets into training datasets and validation datasets. Our graph is constructed based on the spatial locations. The encoder for gene expression profiles is MLP, while the encoder for spatial information is GAT. The default optimizer is Adam. To reduce the memory usage, we implement the adaptive neighbor size selection for graph construction and adaptive mini-batch selection for model training. These two hyper parameters are related to memory limit and performance optimization.

To study the stability, we changed the random seeds from 0 to 9 for every method during the experiments. Then we adjust the hyper-parameters of different methods to achieve their best performance in each dataset. The hyper-parameter adjustment information can be found in Supplementary Note 1. In the benchmarking analysis of biological functions, we impute all the missing genes of spatial data based on different methods and perform evaluation.

Baseline models explanation

In our benchmarking process, we consider seven baseline methods in total for comparison, including gimVI, Tangram, SpaGE, TransImp, ENVI, SpatialScope, and SpaIM. These methods are randomly ordered. We also included TISSUE for benchmarking the performance of uncertainty estimation.

gimVI¹⁹ also models the gene expression from these two modalities into a joint latent space and uses variational inference to generate the output distribution and impute the expression levels of missing genes. However, gimVI does not consider the neighborhood relation in the spatial data and its implementation is not efficient, as shown in our results. Moreover, gimVI meets OOM errors in our benchmarking process.

Tangram²⁴ is a model based on optimal matching. Tangram learns the best match relation between scRNA-seq data and spatial data by learning a mapping function and then performs the imputation based on minimizing the loss of such mapping function. However, Tangram is not scalable and the batch version of Tangram does not consider the difference between local optimal solutions and global optimal solutions.

SpaGE¹⁷ is a model based on dimension reduction and regression. It firstly reduces the high dimensions of the input data, and in the joint low-dimensional space, it trains a regression model to impute the value of missing genes. However, SpaGE is not efficient for Xenium-based data with moderate performance.

TransImp²⁸ is a model based on regression and spatial information regularization. It also relies on dimension reduction for the first step, and in the regression step, it considers both minimizing the loss between predicted data and ground truth data and minimizing the difference of spatial information. However, TransImp meets OOM errors in our benchmarking process and its imputation results lead to poor performance for some downstream applications, for example, RNA velocity inference.

ENVI²⁰ is a model based on conditional auto-encoder. It learns the embeddings of scRNA-seq data and multiplexed spatial transcriptomic data simultaneously and decodes the embeddings to expression space for imputation. However, ENVI meets OOM issues in our benchmarking process.

SpatialScope²¹ is a model based on diffusion model for different tasks related to spatial data. The authors pre-train a generative model based on scRNA-seq data to have a checkpoint. Then they load the model from the checkpoint to fine-tune it based on both scRNA-seq data and spatial data to fill the missing gene expression levels of spatial data based on modelling joint distribution. SpatialScope meets OOM issues in all the datasets we used.

TISSUE²⁹ is a framework for evaluating the reliability of imputation results. TISSUE treats the training dataset as a calibration dataset and computes the uncertainty measurement score based on the spatial neighbors of the calibration dataset. The final lower and upper prediction errors can be estimated based on the distribution of calibration scores. In our experiments, we only considered the default and recommended version of TISSUE based on SpaGE.

SpaIM³⁷ performs imputation by learning the styles of spatial data and scRNA-seq data, and then transfers the style from scRNA-seq data to the spatial domain and performs imputation.

The contribution of imputation for advancing spatial transcriptomics analysis

By imputing the unmeasured genes’ expression levels based on VISTA, we are capable of gaining more information to better analyze a spatial transcriptomic dataset and make biological discoveries, respectively. In this manuscript, we consider several biological applications which can only be performed after imputation. The first application is the discrimination of cell-type-specific information. Since imputation can help us detect more expressible marker genes for a given cell type, we expect that we are able to better distinguish the characteristics of different cell types through clustering. Secondly, we can discover more cell-cell interactions by performing statistical tests based on more expressed genes, and most of the existing CCIs before imputation can be preserved. Thirdly, observing more expressed genes can also help us better characterize variation contributed by biological signals and spatial signals and decompose them. Finally, imputation allows us to model the dynamic change of gene expression across both time and space. If we utilize a scRNA-seq dataset with spliced and unspliced gene expression information, we can impute a spatial transcriptomic dataset with VISTA and perform RNA velocity analysis based on the given spatial dataset. Moreover, if the scRNA-seq data contain perturbation information, we can also perform in-silico perturbation for a spatial transcriptomic dataset with imputation. This application is a demonstration of VISTA’s transfer learning capacity. The details of the applications discussed above are presented in the following sections.

Metrics explanations

Our evaluation analysis involves both statistical metrics and biological application metrics. The statistical metrics are based on evaluation for correlation or distribution comparison. The biological application metrics are based on evaluation for the discovery of spatially variable genes and cell-cell interactions.

Inspired by Li et al.²⁵, we consider four metrics as statistical metrics, including Spearman correlation coefficients (SCC), structural similarity index measure (SSIM), root mean square error (RMSE), and Jensen-Shannon divergence (JS).

1. SCC. The SCC is defined as

$$SCC=\frac{{\rm{cov}}({\rm{R}}({x}_{g}),{\rm{R}}({\widetilde{x}}_{g}))}{{\sigma }_{{\rm{R}}({x}_{g})}{\sigma }_{{\rm{R}}({\widetilde{x}}_{g})}},$$

(14)

where for gene g, we measure the covariance for the rank of ground truth gene expression x_g and predicted gene expression ${\widetilde{x}}_{g}$ across all the samples. σ_* donates the standard deviation for the given ranks. Higher SCC represents better imputation performance, and SCC ∈ ( − 1, 1).

2. SSIM³⁸: Firstly, we scale the gene expression matrix to (0,1) based on:

$${x}_{ij}^{{\prime} }=\frac{{x}_{ij}}{\max \left(| \left\{{x}_{i1},\ldots ,{x}_{iN}\right\}| \right)},$$

(15)

where x_ij donates the gene expression of gene i in cell j, and N represents the total number of cells. SSIM is computed as

$$SSIM=\frac{\left(2{\widetilde{u}}_{i}{u}_{i}+{C}_{1}^{2}\right)\left(2{\rm{cov}}\left({{\boldsymbol{x}}}_{i}^{{\prime} },{\widetilde{{\boldsymbol{x}}}}_{i}^{{\prime} }\right)+{C}_{2}^{2}\right)}{\left({\widetilde{u}}_{i}^{2}+{u}_{i}^{2}+{C}_{1}^{2}\right)\left({\widetilde{\sigma }}_{i}^{2}+{\sigma }_{i}^{2}+{C}_{2}^{2}\right)},$$

(16)

where u_i and ${\widetilde{u}}_{i}$ donate the average expression of gene i across all the cells. σ_i and ${\widetilde{\sigma }}_{i}$ donate the standard deviation of the expression of gene i across all the cells. C₁ and C₃ are set a 0.01 and 0.03 based on. Higher SSIM represents better imputation performance and SSIM ∈ (0, 1).

3. RMSE. Before computing RMSE, we calculate the normalized score (z-score) for each gene i across all the cells, that is:

$${z}_{ij}=\frac{{x}_{ij}-{u}_{i}}{{\sigma }_{i}}.$$

(17)

Therefore, for N cells, the RMSE is defined as:

$$RMSE=\sqrt{\frac{1}{N}\mathop{\sum }\limits_{j=1}^{N}{\left({\widetilde{z}}_{ij}-{z}_{ij}\right)}^{2}},$$

(18)

where ${\widetilde{z}}_{ij}$ and z_ij denote the predicted z-score and ground truth z-score for gene i in cell j. Lower RMSE represents better imputation performance, and RMSE ∈ (0, ∞).

4. JS. To compute JS, we first define the Kullback–Leibler (KL) divergence for two probability distributions a_i and b_i as:

$$KL\left({{\boldsymbol{a}}}_{i}\parallel {{\boldsymbol{b}}}_{i}\right)=\sum_{j=0}^{N}\left({a}_{ij}\times \log \frac{{a}_{ij}}{{b}_{ij}}\right),$$

(19)

where a_ij and b_ij represent the probabilities of gene i in cell j. For gene i, if we have P_i and ${\widetilde{{\boldsymbol{P}}}}_{i}$ to represent the probability vector of the spatial distribution using ground truth gene expression and predicted gene expression across all the cells, we then can define JS as:

$$JS=\frac{1}{2}KL\left({\widetilde{{\boldsymbol{P}}}}_{i}| \frac{{\widetilde{{\boldsymbol{P}}}}_{i}+{{\boldsymbol{P}}}_{i}}{2}\right)+\frac{1}{2}KL\left({{\boldsymbol{P}}}_{i}| \frac{{\widetilde{{\boldsymbol{P}}}}_{i}+{{\boldsymbol{P}}}_{i}}{2}\right).$$

(20)

For the evaluation of biological applications, we consider three cases: 1. Evaluating clustering ability; 2. Identifying SV genes; and 3. Identifying cell-cell interaction pairs.

To evaluate the clustering ability, we use three metrics: NMI, ARI, and ASW for evaluation. We choose Louvain⁸⁴ as the clustering approach based on the imputation results and treat the original annotation of cell types as ground truth.

1. Normalized Mutual Information (NMI): We calculate NMI score based on computing the mutual information between the optimal Louvain clusters and the known cell-type labels and then take the normalization. Therefore, NMI ∈ (0, 1) and higher NMI means better performance.

2. Adjusted Rand Index (ARI): We calculate ARI score by measuring the agreement between optimal Louvain clusters and cell-type labels. Therefore, ARI ∈ (0, 1) and higher ARI means better performance.

3. Average Silhouette Width (ASW): Here we only consider ASW for cell types. For one cell point, ASW calculates the ratio between the inner cluster distance and the intra cluster distance for this cell. Therefore, higher ASW_cell means better biological information preservation. For ASW_cell, we take the normalization, that is:

$$AS{W}_{cell}=\frac{AS{W}_{cell}^{raw}+1}{2}.$$

ASW ∈ (0, 1), higher ASW means better performance.

Here we consider two approaches to identify SV genes, known as Moran’s I score⁴⁵ and SpatialDE²⁶ score.

1. Moran’s I score. Moran’s I statistics can test the auto-correlation for spatial data. For one gene, if the corresponding Moran’s I statistics is I_g, then we can compute the z_g score by normalization:

$${z}_{g} = \frac{{I}_{g}-E[{I}_{g}]}{\sqrt{V[{I}_{g}]}},\\ E[{I}_{g}] = \frac{-1}{n-1},\\ V[{I}_{g}] = E[{I}^{2}]-{(E[I])}^{2}.$$

(21)

Under the null hypothesis, z_g ~ N(0, 1). We then compute the ratio between the number of genes that are identified as SV genes by setting a threshold based on the p-value and the number of total genes. A higher ratio represents better performance. The scale of the ratio is [0,1].

2. SpatialDE score. For a gene g with expression profile y = (y_1g, . . . , y_Ng) across N cells in the spatial domain, we define the given spatial coordinates as O = (o₁, . . . , o_N). The model of SpatialDE is:

$$P\left({y}_{g}\left|\right. {\mu }_{g},{\sigma }_{g}^{2},\delta ,\Sigma \right)=N\left({y}_{g}| {\mu }_{g}\cdot 1,{\sigma }_{g}^{2}\cdot \left(\sum +\delta \cdot \left|\right.\right)\right),$$

(22)

where μ_g and σ_g represent the mean value and standard deviation value for the given gene, and we can determine δ based on optimization. The null hypothesis is $N({\mu }_{g},{\sigma }_{g}^{2})$, so we can perform the test and get the corresponding ratio after we have all the p-values. The ratio is defined same as Moran’s I score’s explanation. We also consider SpatialDE2 as a faster version accelerated by GPU cores for its estimation of the Gaussian process. A higher ratio represents better performance. The scale of the ratio is [0,1].

Here we consider three scores to identify cell-cell interaction pairs, which are known as CellPhoneDB²⁷, CellChat⁸⁵ and COMMOT⁵⁶.

1. CellPhoneDB score. We use CellPhoneDB to infer the cell-cell interactions for all the cells across cell types, suggested by⁸⁶. CellPhoneDB is based on a permutation test. We first shuffle the cell types for all the cells and take the average of known ligands and receptors in the random cell-type clusters under this case as the null distribution. Correspondingly, we have the observed distribution for the ligand-receptor pairs based on the original cell-type labels. Here also set a p-value threshold for all the discovered ligand-receptor pairs, and compute the ratio between the number of significant pairs and the number of total pairs. A higher ratio represents better performance. The scale of the ratio is [0,1].

The p-value threshold is set as 0.05 for all of the metrics we used in this section.

2. CellChat and COMMOT scores. We also use CellChat and COMMOT to investigate the influence of treating imputed data as different omics to the discovery of CCIs. CellChat and COMMOT both use the CellChat database. CellChat is based on a permutation test for the scRNA-seq dataset, while COMMOT uses the gene expression levels in the whole spots (cells) and distance threshold as criteria for spatial transcriptomic data. Here we use the same ratio we defined in the CellPhoneDB score section. The scale of the ratio is [0,1].

Application explanation

Besides the biological applications we have mentioned in the metrics explanation section, we also consider three applications of spatial data after imputation, including 1. RNA velocity/signaling direction inference, 2. Spatial effect (SE) identification, and 3. In-silico perturbation of spatial data. We do not have ground truth information for these three applications.

For RNA velocity inference, we select scRNA-seq data with spliced and unspliced gene expression as reference data to impute our target spatial data. We impute both spliced gene expression and unspliced gene expression for the spatial data and then run scVelo⁵⁷ and VeloVI⁵⁸ after imputation. The velocity is computed based on the spatial location.

For the signaling direction inference based on cell-cell communication, we impute the spatial datasets we have and generate the cell-cell signaling direction after imputation based on COMMOT.

For SE identification, we utilize SIMVI⁸⁷ to analyze the spatial data before imputation and after imputation. By analyzing the embeddings for spatial effect generated by SIMVI with cell types, we can discover cell types with spatial effect and further genes with spatial effect.

For in-silico perturbation, we consider three conditions. If we intend to simulate spatial data with the diseased condition, we utilize the reference scRNA-seq dataset with the diseased condition as input and generate the corresponding imputed spatial data. If we know the spliced and unspliced information of the reference scRNA-seq data, we use Dynamo to infer the RNA velocity of imputed spatial data and then compute the perturbation under certain genes. If the reference scRNA-seq data has been perturbed without spliced or unspliced information for RNA velocity inference, we treat the raw spatial data as input and utilize the output of the decoder belonging to scRNA-seq data to impute the spatial data with perturbation.

Data processing

To run VISTA, we keep the raw count data as input since we model the distribution of gene expressions based on the count data matrix.

For the analysis related to biological applications, we have different processing steps. For clustering analysis, we normalized and performed log1p transformation for the raw count data matrix and then selected the top 2000 highly variable genes (HVGs) for the evaluation of the clustering function. For SVG discovery and cell-cell communication discovery, we still use the raw count data. For the analysis of RNA velocity inference, we normalized and performed log1p transformation for the raw count data matrix and then selected the top 2000 HVGs. For the spatial effect identification, we still use the raw count data.

Statistics and reproducibility

All statistical analyses were performed with the Scanpy package in Python. Two independent groups’ comparisons were performed using the Wilcoxon rank sum test on metrics, with p-value < 0.05 considered statistically significant. For finding significant expressed genes across multiple groups or segments, we performed Wilcoxon rank sum test on each group to the union of the rest of the group. We used four paired datasets for benchmarking and cell-cell communication inference, one dataset for RNA velocity inference, and two datasets for spatial effect estimation.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

We did not generate sequencing datasets in this project. The download links and data statistics are summarized in the Supplementary Data 1. These datasets are cited as follows: osmFISH-brain (source:^6,36), xenium-breast (source:^10,41), xenium-brain (source:^10,41), seqfish-embryo (source:⁴²), starmap-mouse-brain (source:^60,88), hybiss-embryo (source:⁵⁵), and breast-cancer (source:⁶¹). All used datasets are correctly cited in the main text. Source data are provided in this manuscript (Supplementary Data 2). All the datasets are publicly available and have no ethical issues.

Code availability

To run VISTA, we relied on Yale High-performance Computing Center (YCRC) and utilized one NVIDIA A5000 GPU with up to 150 GB RAM. The upper bound of running time for analysis is 24 hours. The codes of VISTA can be found in https://github.com/HelloWorldLTY/VISTA (also in Zendo⁸⁹). The license is MIT license.

References

Hwang, B., Lee, J. H. & Bang, D. Single-cell rna sequencing technologies and bioinformatics pipelines. Exp. Mol. Med. 50, 1–14 (2018).
Article PubMed PubMed Central CAS Google Scholar
Zheng, G. X. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 1–12 (2017).
Article Google Scholar
Regev, A. et al. The human cell atlas. elife 6, e27041 (2017).
Article PubMed PubMed Central Google Scholar
Williams, C. G., Lee, H. J., Asatsuma, T., Vento-Tormo, R. & Haque, A. An introduction to spatial transcriptomics for biomedical research. Genome Med. 14, 1–18 (2022).
Article Google Scholar
Shah, S., Lubeck, E., Zhou, W. & Cai, L. In situ transcription profiling of single cells reveals spatial organization of cells in the mouse hippocampus. Neuron 92, 342–357 (2016).
Article PubMed PubMed Central CAS Google Scholar
Codeluppi, S. et al. Spatial organization of the somatosensory cortex revealed by osmfish. Nat. Methods 15, 932–935 (2018).
Article PubMed CAS Google Scholar
Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. Spatially resolved, highly multiplexed rna profiling in single cells. Science 348, aaa6090 (2015).
Article PubMed PubMed Central Google Scholar
Eng, C.-H. L. et al. Transcriptome-scale super-resolved imaging in tissues by rna seqfish+. Nature 568, 235–239 (2019).
Article PubMed PubMed Central CAS Google Scholar
Janesick, A. et al. High resolution mapping of the tumor microenvironment using integrated single-cell, spatial and in situ analysis. Nat. Commun. 14, 8353 (2023).
Marco Salas, S. et al. Optimizing xenium in situ data utility by quality assessment and best practice analysis workflows. bioRxiv https://doi.org/10.1101/2023.02.13.528102 (2023).
Wang, X. et al. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science 361, eaat5691 (2018).
Article PubMed PubMed Central Google Scholar
Rodriques, S. G. et al. Slide-seq: A scalable technology for measuring genome-wide expression at high spatial resolution. Science 363, 1463–1467 (2019).
Article PubMed PubMed Central CAS Google Scholar
Zhang, D. et al. Spatial epigenome–transcriptome co-profiling of mammalian tissues. Nature 616, 113–122 (2023).
Article PubMed PubMed Central CAS Google Scholar
He, S. et al. High-plex imaging of rna and proteins at subcellular resolution in fixed tissue by spatial molecular imaging. Nat. Biotechnol. 40, 1794–1806 (2022).
Article PubMed CAS Google Scholar
Welch, J. D. et al. Single-cell multi-omic integration compares and contrasts features of brain cell identity. Cell 177, 1873–1887 (2019).
Article PubMed PubMed Central CAS Google Scholar
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).
Article PubMed PubMed Central CAS Google Scholar
Abdelaal, T., Mourragui, S., Mahfouz, A. & Reinders, M. J. Spage: spatial gene enhancement using scrna-seq. Nucleic Acids Res. 48, e107–e107 (2020).
Article PubMed PubMed Central CAS Google Scholar
Shengquan, C., Boheng, Z., Xiaoyang, C., Xuegong, Z. & Rui, J. stplus: a reference-based method for the accurate enhancement of spatial transcriptomics. Bioinformatics 37, i299–i307 (2021).
Article PubMed PubMed Central Google Scholar
Lopez, R. et al. A joint model of unpaired data from scRNA-seq and spatial transcriptomics for imputing missing gene expression measurements. In ICML Comp Bio Workshop,Workshop Track Proceedings (2019).
Haviv, D. et al. The covariance environment defines cellular niches for spatial inference. Nat. Biotechnol. 43, 269–280 (2024).
Wan, X. et al. Integrating spatial and single-cell transcriptomics data using deep generative models with spatialscope. Nat. Commun. 14, 7848 (2023).
Article PubMed PubMed Central CAS Google Scholar
Cang, Z. & Nie, Q. Inferring spatial and signaling relationships between cells from single cell transcriptomic data. Nat. Commun. 11, 2084 (2020).
Article PubMed PubMed Central CAS Google Scholar
Moriel, N. et al. Novosparc: flexible spatial reconstruction of single-cell gene expression with optimal transport. Nat. Protoc. 16, 4177–4200 (2021).
Article PubMed CAS Google Scholar
Biancalani, T. et al. Deep learning and alignment of spatially resolved single-cell transcriptomes with tangram. Nat. Methods 18, 1352–1362 (2021).
Article PubMed PubMed Central Google Scholar
Li, B. et al. Benchmarking spatial and single-cell transcriptomics integration methods for transcript distribution prediction and cell type deconvolution. Nat. Methods 19, 662–670 (2022).
Article PubMed CAS Google Scholar
Svensson, V., Teichmann, S. A. & Stegle, O. Spatialde: identification of spatially variable genes. Nat. Methods 15, 343–346 (2018).
Article PubMed PubMed Central CAS Google Scholar
Efremova, M., Vento-Tormo, M., Teichmann, S. A. & Vento-Tormo, R. Cellphonedb: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complexes. Nat. Protoc. 15, 1484–1506 (2020).
Article PubMed CAS Google Scholar
Qiao, C. & Huang, Y. Reliable imputation of spatial transcriptome with uncertainty estimation and spatial regularization. Patterns (NY) 5, 101021 (2024).
Sun, E. D., Ma, R., Navarro Negredo, P., Brunet, A. & Zou, J. Tissue: uncertainty-calibrated prediction of single-cell spatial transcriptomics improves downstream analyses. Nat. Methods 21, 444–454 (2024).
Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings (OpenReview.net, 2017) https://openreview.net/forum?id=SJU4ayYgl.
Liu, T., Wang, Y., Ying, R. & Zhao, H. Muse-gnn: Learning unified gene representation from multimodal biological graph data. Adv. Neural Inform. Processing Syst. 36 (2024).
Fey, M. & Lenssen, J. E. Fast Graph Representation Learning with PyTorch Geometric https://github.com/pyg-team/pytorch_geometric (2019).
Blei, D. M., Kucukelbir, A. & McAuliffe, J. D. Variational inference: A review for statisticians. J. Am. Stat. Assoc. 112, 859–877 (2017).
Article CAS Google Scholar
Dupont, E. Learning disentangled joint continuous and discrete representations. Adv. Neural Information Processing Systems 31 (2018).
Kingma, D. & Ba, J. Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations (ICLR) San Diego, CA, USA, Conference Track Proceedings (2015).
Zeisel, A. et al. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347, 1138–1142 (2015).
Article PubMed CAS Google Scholar
Li, B. et al. SpaIM: Single-cell spatial transcriptomics imputation via style transfer. Nat. Commun. 16, 7861 (2025).
Wang, Z., Bovik, A. C., Sheikh, H. R. & Simoncelli, E. P. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004).
Article PubMed Google Scholar
McKellar, D. W. et al. Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration. Commun. Biol. 4, 1280 (2021).
Article PubMed PubMed Central CAS Google Scholar
Walter, L. D. et al. Transcriptomic analysis of skeletal muscle regeneration across mouse lifespan identifies altered stem cell states. Nat. Aging 4, 1862–1881 (2024).
Article PubMed PubMed Central CAS Google Scholar
Lin, Y. et al. scclassify: sample size estimation and multiscale classification of cells using single and multiple reference. Mol. Syst. Biol. 16, e9389 (2020).
Article PubMed PubMed Central CAS Google Scholar
Lohoff, T. et al. Integration of spatial and single-cell transcriptomic data elucidates mouse organogenesis. Nat. Biotechnol. 40, 74–85 (2022).
Article PubMed CAS Google Scholar
Johnson, J., Douze, M. & Jégou, H. Billion-scale similarity search with gpus. IEEE Trans. Big Data 7, 535–547 (2019).
Article Google Scholar
Luecken, M. D. et al. Benchmarking atlas-level data integration in single-cell genomics. Nat. Methods 19, 41–50 (2022).
Article PubMed CAS Google Scholar
Palla, G. et al. Squidpy: a scalable framework for spatial omics analysis. Nat. Methods 19, 171–178 (2022).
Article PubMed PubMed Central CAS Google Scholar
Kats, I., Vento-Tormo, R. & Stegle, O. Spatialde2: fast and localized variance component analysis of spatial transcriptomics. Biorxiv https://doi.org/10.1101/2021.10.27.466045 (2021).
Chen, C., Kim, H. J. & Yang, P. Evaluating spatially variable gene detection methods for spatial transcriptomics data. Genome Biol. 25, 18 (2024).
Article PubMed PubMed Central Google Scholar
Zhu, J., Sun, S. & Zhou, X. Spark-x: non-parametric modeling enables scalable and robust detection of spatial expression patterns for large spatial transcriptomic studies. Genome Biol. 22, 184 (2021).
Article PubMed PubMed Central CAS Google Scholar
Rydén, L. et al. Tumor specific VEGF-a and VEGFR2/kdr protein are co-expressed in breast cancer. Breast Cancer Res. Treat. 82, 147–154 (2003).
Article PubMed Google Scholar
Zhou, Z., Christofidou-Solomidou, M., Garlanda, C. & DeLisser, H. M. Antibody against murine PECAM-1 inhibits tumor angiogenesis in mice. Angiogenesis 3, 181–188 (1999).
Article PubMed CAS Google Scholar
Higuchi, K. et al. Cadherin 5 expression correlates with poor survival in human gastric cancer. J. Clin. Pathol. 70, 217–221 (2016).
Vieira, S. C. et al. Cd34 as a marker for evaluating angiogenesis in cervical cancer. Pathol.-Res. Pract. 201, 313–318 (2005).
Article PubMed CAS Google Scholar
La Manno, G. et al. RNA velocity of single cells. Nature 560, 494–498 (2018).
Article PubMed PubMed Central Google Scholar
Abdelaal, T. et al. Sirv: Spatial inference of RNA velocity at the single-cell resolution. NAR Genomics Bioinform. 6 (2024).
La Manno, G. et al. Molecular architecture of the developing mouse brain. Nature 596, 92–96 (2021).
Article PubMed Google Scholar
Cang, Z. et al. Screening cell–cell communication in spatial transcriptomics via collective optimal transport. Nat. Methods 20, 218–228 (2023).
Article PubMed PubMed Central CAS Google Scholar
Bergen, V., Lange, M., Peidli, S., Wolf, F. A. & Theis, F. J. Generalizing RNA velocity to transient cell states through dynamical modeling. Nat. Biotechnol. 38, 1408–1414 (2020).
Article PubMed CAS Google Scholar
Gayoso, A. et al. Deep generative modeling of transcriptional dynamics for RNA velocity analysis in single cells. Nat. Methods 21, 50–59 (2023).
Jiang, J., Wang, C., Qi, R., Fu, H. & Ma, Q. Scread: a single-cell RNA-seq database for alzheimer’s disease. Iscience 23 (2020).
Zeng, H. et al. Integrative in situ mapping of single-cell transcriptional states and tissue histopathology in a mouse model of alzheimer’s disease. Nat. Neurosci. 26, 430–446 (2023).
Article PubMed PubMed Central CAS Google Scholar
Data set. Vizgen MERFISH FFPE Human Immuno-oncology Data Set (Vizgen, 2022).
Kim, N. et al. Single-cell RNA sequencing demonstrates the molecular and cellular reprogramming of metastatic lung adenocarcinoma. Nat. Commun. 11, 2285 (2020).
Article PubMed PubMed Central CAS Google Scholar
Theodoris, C. V. et al. Transfer learning enables predictions in network biology. Nature 618, 616–624 (2023).
Jin, X. et al. In vivo perturb-seq reveals neuronal and glial abnormalities associated with autism risk genes. Science 370, eaaz6063 (2020).
Article PubMed PubMed Central CAS Google Scholar
Qiu, X. et al. Mapping transcriptomic vector fields of single cells. Cell 185, 690–711 (2022).
Article PubMed PubMed Central CAS Google Scholar
Lee, T., Lee, A. & Luo, L. Development of the drosophila mushroom bodies: sequential generation of three distinct types of neurons from a neuroblast. Development 126, 4065–4076 (1999).
Article PubMed CAS Google Scholar
Van Straaten, H., Hekking, J., Thors, F., Wiertz-Hoessels, E. & Drukker, J. Induction of an additional floor plate in the neural tube. Acta Morphologica Neerlando-Scandinavica 23, 91–97 (1985).
PubMed Google Scholar
Yamada, T., Pfaff, S. L., Edlund, T. & Jessell, T. M. Control of cell pattern in the neural tube: motor neuron induction by diffusible factors from notochord and floor plate. Cell 73, 673–686 (1993).
Article PubMed CAS Google Scholar
Tsarovina, K. et al. Essential role of gata transcription factors in sympathetic neuron development. Development 131, 4775–4786 (2004).
Article PubMed CAS Google Scholar
Cao, H. et al. Association of spi1 haplotypes with altered spi1 gene expression and alzheimer’s disease risk. J. Alzheimer’s. Dis. 86, 1861–1873 (2022).
Article CAS Google Scholar
Tian, L., Chen, F. & Macosko, E. Z. The expanding vistas of spatial transcriptomics. Nat. Biotechnol. 41, 773–782 (2023).
Article PubMed CAS Google Scholar
Rao, A., Barkley, D., França, G. S. & Yanai, I. Exploring tissue architecture using spatial transcriptomics. Nature 596, 211–220 (2021).
Article PubMed PubMed Central CAS Google Scholar
scvi tools. Github issue of gimvi https://github.com/scverse/scvi-tools/pull/2297 (2032).
Nguyen, K. et al. Improving mini-batch optimal transport via partial transportation. In International Conference on Machine Learning, 16656–16690 (PMLR, 2022).
Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).
Article PubMed PubMed Central CAS Google Scholar
Buenrostro, J. D. et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486–490 (2015).
Article PubMed PubMed Central CAS Google Scholar
Zhuang, F. et al. A comprehensive survey on transfer learning. Proc. IEEE 109, 43–76 (2020).
Article Google Scholar
Kingma, D. P. & Welling, M. Auto-encoding variational {Bayes}. In 2nd International Conference on Learning Representations, Conference Track Proceedings (2014).
Veličković, P. et al. Graph attention networks. In International Conference on Learning Representations https://openreview.net/forum?id=rJXMpikCZ (2018).
Minoura, K., Abe, K., Nam, H., Nishikawa, H. & Shimamura, T. A mixture-of-experts deep generative model for integrated analysis of single-cell multiomics data. Cell Rep. Methods 1 (2021).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition, 770–778 (IEEE, 2016).
Blitzer, J., Crammer, K., Kulesza, A., Pereira, F. & Wortman, J. Learning bounds for domain adaptation. Adv. Neural Inform. Process. Syst. 20 (2007).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Google Scholar
Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech.: theory Exp. 2008, P10008 (2008).
Article Google Scholar
Jin, S. et al. Inference and analysis of cell-cell communication using cellchat. Nat. Commun. 12, 1088 (2021).
Article PubMed PubMed Central CAS Google Scholar
Liu, Z., Sun, D. & Wang, C. Evaluation of cell-cell interaction methods by integrating single-cell RNA sequencing data with spatial information. Genome Biol. 23, 1–38 (2022).
Article Google Scholar
Dong, M., Kluger, H., Fan, R. & Kluger, Y. Simvi reveals intrinsic and spatial-induced states in spatial omics data. bioRxiv https://doi.org/10.1101/2023.08.28.554970 (2023).
Tang, Z. et al. Search and match across spatial omics samples at single-cell resolution. Nat. Methods 21, 1818–1829 (2024).
Article PubMed PubMed Central CAS Google Scholar
HelloWorldLTY. Helloworldlty/vista: Release Zendo file. https://doi.org/10.5281/zenodo.17968354 (2025).

Download references

Acknowledgements

The authors appreciate the comments, feedback, and suggestions from Dr. Jia Zhao for uncertainty explanations, Mingze Dong for the use of SIMVI, and Dr. Romain Lopez for improvement of method description.

Author information

Authors and Affiliations

Interdepartmental Program in Computational Biology & Bioinformatics, Yale University, New Haven, 06511, CT, USA
Tianyu Liu & Hongyu Zhao
Department of Biostatistics, Yale University, New Haven, 06511, CT, USA
Yingxin Lin & Hongyu Zhao
Department of Statistics, University of Wisconsin-Madison, Madison, 53706, WI, USA
Xiao Luo
Department of Computer Science, University of California, Los Angeles, Los Angeles, 90095, CA, USA
Yizhou Sun

Authors

Tianyu Liu
View author publications
Search author on:PubMed Google Scholar
Yingxin Lin
View author publications
Search author on:PubMed Google Scholar
Xiao Luo
View author publications
Search author on:PubMed Google Scholar
Yizhou Sun
View author publications
Search author on:PubMed Google Scholar
Hongyu Zhao
View author publications
Search author on:PubMed Google Scholar

Contributions

Y.L. and T.L. designed this study. T.L. and X.L. designed the model. T.L. ran all the experiments. T.L., Y.L., X.L., and H.Z. wrote the manuscript. Y.S. and H.Z. supervised this work.

Corresponding author

Correspondence to Hongyu Zhao.

Ethics declarations

Competing interests

This project is supported in part by NIH grant U24 HG012108 awarded to H.Z. Other authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks Juexin Wang, Qianqian Song and Fei Wang for their contribution to the peer review of this work. Primary Handling Editor: George Inglis. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (download PDF )

Description of Additional Supplementary Files (download DOCX )

Supplementary Data 1 (download XLSX )

Supplementary Data 2 (download XLSX )

Reporting Summary (download PDF )

Transparent Peer Review file (download PDF )

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Liu, T., Lin, Y., Luo, X. et al. VISTA uncovers missing gene expression and spatial-induced information for spatial transcriptomic data analysis. Commun Biol 9, 203 (2026). https://doi.org/10.1038/s42003-025-09479-6

Download citation

Received: 08 September 2025
Accepted: 24 December 2025
Published: 08 January 2026
Version of record: 10 February 2026
DOI: https://doi.org/10.1038/s42003-025-09479-6