CellNavi predicts genes directing cellular transitions by learning a gene graph-enhanced cell state manifold

Wang, Tianze; Pan, Yan; Ju, Fusong; Zheng, Shuxin; Liu, Chang; Min, Yaosen; Jiang, Qun; Liu, Xinwei; Xia, Huanhuan; Liu, Guoqing; Liu, Haiguang; Deng, Pan

doi:10.1038/s41556-025-01755-1

Download PDF

Article
Open access
Published: 03 October 2025

CellNavi predicts genes directing cellular transitions by learning a gene graph-enhanced cell state manifold

Tianze Wang^1,2^na1,
Yan Pan^1,3^na1,
Fusong Ju¹^na1,
Shuxin Zheng¹^na1^nAff5,
Chang Liu¹^na1,
Yaosen Min¹^nAff5,
Qun Jiang^1,3,
Xinwei Liu^1,4,
Huanhuan Xia¹^nAff5,
Guoqing Liu¹,
Haiguang Liu¹ &
…
Pan Deng ORCID: orcid.org/0009-0008-7948-4939¹

Nature Cell Biology volume 27, pages 1863–1874 (2025)Cite this article

25k Accesses
3 Altmetric
Metrics details

Subjects

Abstract

A select few genes act as pivotal drivers in the process of cell state transitions. However, finding key genes involved in different transitions is challenging. Here, to address this problem, we present CellNavi, a deep learning-based framework designed to predict genes that drive cell state transitions. CellNavi builds a driver gene predictor upon a cell state manifold, which captures the intrinsic features of cells by learning from large-scale, high-dimensional transcriptomics data and integrating gene graphs with directional connections. Our analysis shows that CellNavi can accurately predict driver genes for transitions induced by genetic, chemical and cytokine perturbations across diverse cell types, conditions and studies. By leveraging a biologically meaningful cell state manifold, it is proficient in tasks involving critical transitions such as cellular differentiation, disease progression and drug response. CellNavi represents a substantial advancement in driver gene prediction and cell state manipulation, opening new avenues in disease biology and therapeutic discovery.

Deep model predictive control of gene expression in thousands of single cells

Article Open access 08 March 2024

Deciphering driver regulators of cell fate decisions from single-cell transcriptomics data with CEFCON

Article Open access 20 December 2023

Revealing invisible cell phenotypes with conditional generative modeling

Article Open access 11 October 2023

Main

Understanding the genetic drivers of cellular transitions is crucial for elucidating complex biological processes and disease mechanisms^1,2,3. However, identifying these drivers remains inherently challenging due to the vast number of genes involved in transitions and their complex interdependencies, contrasted with limited experimental capacity and incomplete biological knowledge. Therefore, in silico methods capable of predicting driver genes across diverse contexts are highly desirable.

Traditionally, efforts to pinpoint critical driver genes have primarily relied on network-based methodologies, with a particular focus on gene regulatory networks (GRNs)^4,5,6,7,8. Although GRN-centric approaches have made notable progress, they also encounter limitations that hinder their broader use. For example, deducing accurate GRNs within heterogeneous cell populations, which is more relevant to translational research, remains a challenge^9,10. Moreover, GRN models tend to prioritize transcription factors and may overlook non-transcriptional drivers of cellular transitions. This limits our understanding of complex cellular processes such as disease progression, immune modulation and pharmacological responses.

To this end, we developed CellNavi, a deep learning framework designed to predict driver genes and navigate cellular transitions. CellNavi constructs a driver gene predictor (DGP) on top of a learned manifold that parameterizes valid cell states. This manifold is modelled by mapping raw cell state representations onto a lower-dimensional coordinate space, where the dimensions correspond to intrinsic features of cell states, and the distance reflects the biological similarity between cells. To build this manifold, CellNavi is trained on large-scale, high-dimensional single-cell transcriptomic data, along with prior directional gene graphs that reveal the underlying structure of cell states. By projecting cellular data onto this biologically meaningful space with reduced dimensionality and enhanced biological relevance, CellNavi provides a universal framework that generalizes across diverse cellular contexts, allowing robust driver gene predictions even in previously unexplored cell types or conditions.

Our results show that CellNavi excels at predicting driver genes across a wide range of biological transitions, demonstrating strong performance in quantitative tasks curated in both immortalized cell lines and primary cells. It identifies crucial regulators in T cell differentiation and uncovers key genes associated with neurodegenerative diseases. Notably, CellNavi infers mechanisms of action for drug compounds without the need for drug-specific training, underscoring its potential in drug discovery. In summary, CellNavi offers a powerful framework for deciphering cell state transitions and their underlying mechanisms, holding profound promise for advancing cell biology and disease research.

Overview of CellNavi

CellNavi is designed to predict driver genes for given cellular transitions, where the transcriptomic data of the source and target cells represent the initial and final states of these transitions (Fig. 1a–c).

CellNavi comprises two main components: the cell manifold model (CMM), which captures and represents cell states, and the DGP, which identifies key genes driving these transitions based on learned cell representations (Fig. 1b).

The CMM is built to capture valid cell states across diverse biological contexts. While transcriptomes are often used to represent cell states, valid cell states do not span the entire high-dimensional transcriptomic space but instead form a lower-dimensional manifold (Fig. 1c). To model this, the CMM maps transcriptomic vectors to a lower-dimensional coordinate space that represents the intrinsic features of cell states, while preserving the relative similarities between cells (dimensionality considerations are discussed in Supplementary Note 1).

We first curated a dataset of approximately 20 million single-cell transcriptomic profiles sourced from the Human Cell Atlas (HCA)¹¹ (Fig. 1d) and adapted a transformer architecture based on attention mechanisms, known for its ability to discern complex patterns in large-scale data^{12,13,14,15,16,17}, to train the CMM (Fig. 1e). The training involved a self-supervised downsampling reconstruction task (Methods and Supplementary Note 2). To prioritize cell rather than gene-level representations, we developed a decoder module to reconstruct gene expression profiles from the cell coordinates—representations of cells within the coordinate space of the cell state manifold—generated by the CMM (Fig. 1e and Extended Data Fig. 1; Methods). This approach aligns cells across varying sequencing depths (Extended Data Fig. 2a) and recapitulates developmental trajectories from single cells (Extended Data Fig. 2b–d), indicating that it captures both intra- and intercellular features.

However, relying solely on transcriptomic data may overlook the intricate gene–gene interactions that are crucial for describing and distinguishing cell states. To address this, we incorporated 20 million cell-specific gene graphs into the CMM training process (Fig. 1d,e). These graphs encode directional connections derived from a prior network that spans over 30,000 human genes and their associated signalling pathways¹⁸ (Methods). More specifically, in these gene graphs, each edge represents a causal relationship between two genes, with the direction indicating the regulatory influence from one gene to the other (Methods). These graphs provide richer information about the complex dependencies among genes, which extend beyond simple transcriptomic data, hence better implying intrinsic variables spanning the valid cell space. To leverage these gene graphs, we replaced the standard transformer encoder layer in the CMM using a GeneGraph attention layer (Extended Data Fig. 1b). These layers, inspired by attention variants tailored for graph data¹⁹, can process gene networks, thus enabling the model to integrate critical gene–gene relationships. With these designs, the model is driven to cultivate a manifold that systematically represents cell states and effectively reflects the relationships between cells, forming an informative foundation for driver gene prediction.

Building upon this manifold, we developed the DGP to predict genes driving specified cellular transitions (Methods and Extended Data Fig. 3a). The DGP is trained on clustered regularly interspaced short palindromic repeats (CRISPR) screen data, which link genetic perturbations to consequent changes in cell states^{20,21,22,23,24,25}. We designated unperturbed controls and CRISPR-perturbed cells as source and target pairs, respectively, and utilized validated perturbed genes as labels for joint training (fine-tuning) of the CMM and DGP (Fig. 1f). Specifically, for each cell pair, their transcriptomic profiles are transformed into cell coordinates by the CMM, which are then processed by the DGP to generate a likelihood score vector indicating the probability that various candidate genes are orchestrating the transitions (Extended Data Fig. 3a).

We demonstrate that CellNavi, fine-tuned on CRISPR screen data—typically conducted on cultured cells or homogeneous populations and focusing on immediate genetic perturbations—can be extended to more complex transitions in heterogeneous tissues and primary cells (Fig. 1g and Extended Data Fig. 3b). By leveraging a biologically meaningful manifold, CellNavi generalizes knowledge gained from CRISPR screens beyond their original scope, to cellular transitions that are challenging to investigate using regular CRISPR methodologies. However, we acknowledge that CellNavi’s performance in specific contexts may benefit from additional fine-tuning on relevant CRISPR datasets. Incorporating expanded experimental data may further enhance its applicability across diverse biological settings with minimal adaptation.

Quantitative evaluation of CellNavi

To assess the capabilities of CellNavi, we first evaluated its performance on CRISPR perturbation datasets, where driver gene information is well established for transitions from source (unperturbed) to target (perturbed) cells.

We initially applied CellNavi to the Schmidt dataset, a CRISPR activation screen profiling 69 genetic perturbations²⁶. This dataset captures distinct expression profiles and molecular phenotypes across both resting and restimulated T cells, within and between different cell types, before and after perturbations (Extended Data Fig. 4a). We fine-tuned our model on restimulated T cells and tested it on resting T cells (Fig. 2a). This set-up allowed us to evaluate CellNavi’s ability to generalize across heterogeneous primary cells and predict driver genes in new cell states.

**Fig. 2: Quantitative assessment of CellNavi.**

For each source–target cell pair, CellNavi prioritizes candidate genes based on their predicted likelihood scores. Across 23,047 source–target cell pairs, CellNavi achieves a top-1 accuracy of 0.621 and a top-5 accuracy of 0.733 (Fig. 2b), while maintaining strong performance across additional metrics (Fig. 2b,c and Extended Data Fig. 4b). Interestingly, substantial variation in top-1 accuracy was observed across perturbed genes, independent of sample size (Fig. 2d). Correlation analysis between gene-wise performance and the Local Inverse Simpson’s Index (LISI)²⁷ suggests that CellNavi’s accuracy is influenced by the degree of perturbation heterogeneity: perturbations with low average LISI values, indicative of a more distinct and homogeneous response, were associated with higher accuracy (top-1 accuracy >0.8, Fig. 2e).

To demonstrate CellNavi’s effectiveness, we compared it with two alternative methods: SCENIC/SCENIC+^4,5, a training-free approach that infers GRNs from transcriptomic data with a focus on master regulators, and GEARS²⁸, an in silico perturbation approach, which targets a partially inverse problem of cellular transition prediction (Methods). Both SCENIC and GEARS exhibited markedly lower performance compared to CellNavi (Fig. 2b–d and Extended Data Fig. 4b). In addition, SCENIC, the network-based approaches, faced challenges in identifying regulons at the single-cell level (Extended Data Fig. 4c) and therefore struggled to make predictions in many cases. To investigate whether this is a broad challenge for GRN inference methods, we evaluated three alternative GRN inference approaches: GENIE3²⁹, GRNBoost2³⁰ and RENGE³¹ (Methods). These methods similarly exhibited poor performance in single-cell contexts (Supplementary Table 1).

CellNavi does not simply predict driver genes from expression changes. We conducted an ablation study by systematically removing the expression of perturbed genes from the input. Although this led to a decrease in performance, CellNavi still maintained substantial predictive accuracy, far surpassing expectations of random prediction (Extended Data Fig. 4d,e). In addition, DGE analysis revealed that the rankings of differentially expressed genes were poorly correlated with the actual perturbed genes (Fig. 2b–d and Extended Data Fig. 4b). These results suggest that CellNavi identifies driver genes beyond those detectable by expression shifts alone.

We further tested CellNavi on the Norman dataset³², which features a CRISPR interference screen on the K562 cell line. This dataset encompasses 105 single-gene and 131 gene pair perturbations, allowing us to assess CellNavi’s performance on transitions driven by both single and multiple genes. Using the unsupervised Leiden algorithm³³, we stratified the cells by cluster, holding out one cluster for testing and training on the remaining ones (Fig. 2a and Extended Data Fig. 4f). To ensure rigorous evaluation, we excluded all multigene perturbations from training.

CellNavi maintained strong performance on single driver gene prediction in the Norman dataset (Fig. 2f,g, Extended Data Fig. 4g and Supplementary Table 1). To evaluate multigene scenarios, we focused on the predicted rankings of perturbed genes. CellNavi ranked the first and second perturbed genes at averages of 7.9 and 31.2 out of 105 candidates, respectively, greatly outperforming all other tested methods (Fig. 2h).

Several recent studies have indicated that linear models can outperform deep learning methods in cell modelling tasks^34,35,36,37. To investigate this, we evaluated multiple linear models for driver gene prediction under various conditions. Our results showed that CellNavi consistently outperformed these linear models by a substantial margin across settings (Supplementary Note 3). Furthermore, we applied cross-validation to ensure robust and unbiased evaluation and found that CellNavi demonstrated consistently superior performance across these conditions (Supplementary Tables 2 and 3). Altogether, these results, spanning diverse datasets and metrics, highlight CellNavi’s strong capability to identify genes driving cellular changes, even in previously uncharacterized cell states.

Evaluating model components and graph configurations

To assess the contributions of the CMM and DGP components, and to evaluate whether pretraining with the CMM improves generalization across biological contexts, we designed two ablated methods. The first combined the DGP with raw gene expression vectors instead of outputs from the CMM (no-CMM). The second replaced the DGP with a simpler multinomial logistic regression model (no-DGP). In addition to the Norman single perturbation split, which utilizes a cluster-based holdout strategy (out-of-domain split), we curated an alternative evaluation approach using random holdout to simulate a scenario without generalization (in-domain split). Removing either CMM pretraining (no-CMM) or DGP fine-tuning (no-DGP) led to reduced performance; however, for out-of-domain split, the absence of CMM pretraining (no-CMM) caused a greater drop in performance compared to the in-domain split scenario (Extended Data Fig. 5). These results highlight that CMM pretraining is essential for generalization across biologically diverse contexts, while DGP fine-tuning further optimizes task-specific predictions.

We also evaluated the impact of the NicheNet gene graph on CellNavi’s predictions. Replacing NicheNet with GRNs inferred using GENIE3, GRNBoost2 or RENGE resulted in reduced performance (Extended Data Table 1), underscoring the advantage of integrating pathway-level information beyond GRNs, particularly in modelling perturbation-induced transitions. Furthermore, we tested graph configurations with varying levels of connectivity, including fully connected graphs, sparsified graphs with edges reduced to 1/10 or 1/20 of the original graph, and random graphs with the same sparsity as NicheNet (Methods). All alternative configurations led to further performance declines relative to biologically meaningful graphs constructed using diverse GRN inference methods (Extended Data Table 1). Collectively, these results emphasize the importance of leveraging biologically meaningful and comprehensive gene graphs, such as NicheNet, to ensure predictive robustness and accuracy.

CellNavi identifies key genes in T cell differentiation

We next applied CellNavi to the Cano-Gomez dataset³⁸, which profiled T cell differentiation by stimulating naive and memory CD4⁺ T cells in vitro with anti-CD3/anti-CD28 and cytokines. During this process, external signals, such as antigens and cytokines, activate key genes modulating genetic circuits and gene expression programs, allowing T cells to adopt specialized functions. We assessed whether CellNavi could identify such key genes underlying transitions.

For this dataset, we constructed source–target cell pairs using Th0 cells as the source and cytokine-induced cells as targets. As cells differentiated into various effector T cell subtypes after stimulation^{26,38,39,40,41,42,43}, we first compiled a comprehensive marker gene set and computed a ‘transition score’ to quantify differentiation into these subtypes for each cell. Notably, marker genes associated with IL-2^hi, IFNγ^hi and T helper 2 (T_H2) cells were strongly enriched (Extended Data Fig. 6), and transition scores towards these cell types demonstrated clear patterns (Fig. 3a–c and Methods). We then examined CellNavi’s ability to identify driver genes across these effector T cell groups. Corresponding cell pairs were input into a CellNavi model trained on the Schmidt dataset, which encompasses extensive immune-related gene programs. Finally, we curated a literature-based list of established driver genes for phenotypic transitions towards specific effector cell types^{26,42,43,44,45,46,47,48,49} (Supplementary Table 4) and evaluated CellNavi’s performance in prioritizing these genes.

**Fig. 3: CellNavi identifies key genes involved in T cell differentiation.**

CellNavi accurately ranked CD28 and VAV1, key drivers of IL-2^hi cells, as the top candidates in the IL-2^hi group defined by the transition score (Fig. 3d). Similarly, high rankings were observed for CD27 and IL9R in IFNγ-high cells, and GATA3 in T_H2 cells (Fig. 3d). We further analysed the average rankings of these established driver genes across the different effector cell groups. As expected, the relevant driver genes consistently ranked higher in their corresponding cell groups where they are known to drive differentiation. Notably, CD28, VAV1, CD27 and IL9R achieved average rankings of 2.6, 2.9, 5.3 and 8.5, respectively, in their associated cell groups, greatly outperforming their rankings in unrelated groups (Fig. 3e). These results demonstrate CellNavi’s effectiveness in identifying key genes that govern distinct differentiation pathways while distinguishing between cell fates. However, CD28’s dual role in IL-2 and IFNγ regulation was not fully captured by the model. In addition, although GATA3 ranked highly in Th2 cells, its average ranking was not as strong as expected. Upon further inspection of the T_H2 cluster, we observed that GATA3 was ranked first in an aggregated subset of cells, while its ranking was more dispersed across the entire T_H2 group (Fig. 3f), suggesting heterogeneity within the cluster.

Next, we examined the likelihood scores assigned by CellNavi to driver genes across different cell groups. For known driver genes, CellNavi consistently assigned higher likelihood scores within their corresponding cell groups compared to other groups (Fig. 3g), suggesting that these scores accurately prioritize key driver genes. In addition, the scores could be used to distinguish cell states undergoing specific transitions (Fig. 3h,i and Methods), offering an alternative approach for cell state characterization.

CellNavi predicts key genes during pathogenesis

We then investigated whether CellNavi could predict key genes involved in disease progression, using an in vitro model system of neurodegenerative diseases, specifically the Fernandes dataset⁵⁰. This system comprises induced pluripotent stem (iPS) cell-derived dopaminergic neurons subjected to tunicamycin treatment. Tunicamycin induces endoplasmic reticulum (ER) stress and Parkinson’s disease (PD)-like symptoms by inhibiting N-linked glycosylation⁵¹, a process that affects a broad spectrum of proteins post-translationally, without perturbing any single gene directly.

Before this analysis, CellNavi was trained on single-cell CRISPR screen data on iPS cell-derived neurons from a different study, the Tian dataset⁵². While both studies investigate neurodegenerative diseases using human iPS cell-derived neurons, they differ in the source of iPS cells and the differentiation protocols, resulting in the generation of distinct neuron types^50,52,53 (Fig. 4a).

**Fig. 4: CellNavi predicts key genes involved in neurodegenerative pathogenesis.**

After training, we input approximately 47,000 source–target cell pairs from the Fernandes dataset into CellNavi, using untreated cells as sources and cells exposed to tunicamycin as targets. We asked CellNavi to prioritize 184 candidate genes, including 5 known ER stress response genes. CellNavi successfully pinpointed EIF2S1, BAX and HSPA5, which achieved median rankings of 3, 7 and 16, respectively, among the candidate genes (Fig. 4b). However, HYOU1 and VCP ranked lower. One possible explanation is that these genes play more nuanced roles in the ER stress response or are involved in pathways not prominently activated under the specific experimental conditions of this study.

We next examined the top 20 predicted genes for each cell pair. While a total of 31 genes were significantly enriched (Fig. 4c), FAM57B, EIF2S1, NDUFS8, BAX and CYCS consistently ranked highest across the majority of cells. Notably, EIF2S1 and BAX are well-established ER stress regulators, while NDUFS8 and CYCS are linked to mitochondrial stress, which is often closely associated with ER stress⁵⁴. In parallel, Fernandes et al. previously identified six subtypes of iPS cell-derived neurons from transcriptomic data and our top 20 predictions revealed subtype-specific gene preferences. For instance, our model suggests that FARP1, CELF1, HYOU1 and APEX1 may play more critical roles in progenitor cells (Fig. 4c). Lastly, except for HSPA5 and HYOU1, most predicted genes showed modest expression changes (Fig. 4d and Extended Data Fig. 7), consistent with previous observations that CellNavi identifies key regulators beyond those detectable by expression shifts alone.

CellNavi reveals mechanisms of action for drug compounds

Understanding the mechanisms of action of novel drug candidates may enhance drug safety and efficacy, reduce development costs and accelerate drug discovery process. However, conventional drug screening paradigms often fall short in elucidating the cellular-level effects that drive biological functions and therapeutic outcomes.

Here, we applied CellNavi to predict key genes modulated by histone deacetylase (HDAC) inhibitors, a class of antitumour drugs with promising therapeutic potential in cancer treatment⁵⁵. HDACs are enzymes integral to post-translational protein modifications and interact with various oncogenic pathways to promote tumour progression^56,57. The intricate downstream pathways influenced by HDAC presents a considerable challenge in fully understanding mechanisms through which HDAC inhibitors exert their effects within cells.

For this purpose, we applied CellNavi to a chemical screen that quantified the transcriptomic response of K562 cells to 17 distinct HDAC inhibitors (referred to as the Srivastan dataset)⁵⁸. In this set-up, vehicle-treated cells were designated as sources, while cells exposed to the HDAC inhibitors served as targets. The predicted likelihood score indicated whether a gene was modulated during drug treatment, with higher scores suggesting a more prominent role during treatment with specific HDAC inhibitors. Notably, CellNavi was trained exclusively on genetic perturbations²⁵.

While the transcriptomic data depicted a mixed response across the inhibitors (Extended Data Fig. 8a), the likelihood score vectors effectively clustered the inhibitors into distinct clusters (Fig. 5a,b and Extended Data Fig. 8b). Further analysis revealed diversity in the top-ranked driver genes (Fig. 5c). Specifically, cells treated with mocetinostat, tucidinostat, entinostat and tacedinaline (grouped in cluster 3) exhibited high scores for mitochondrial-related genes such as MRPS31 and NDUFB7. By contrast, most other compounds prioritized genes related to RNA splicing and transcription regulation, such as PRPF3 and POLR2A.

**Fig. 5: CellNavi reveals diverse downstream gene programs affected by HDAC inhibitors.**

Gene Ontology (GO) enrichment analysis of the top 50 genes predicted for each inhibitor revealed a consistent pattern (Fig. 5d and Supplementary Fig. 1): compounds in cluster 3 were enriched for genes involved in biosynthetic processes, mitochondrial function and protein metabolism, whereas compounds in other clusters were enriched in gene programs related to RNA splicing, processing and metabolism. These findings align with the known effect of deacetylation inhibition, which lowers cytoplasmic acetate levels and alters acetyl-CoA concentrations, a key metabolite involved in cellular metabolism⁵⁸. Moreover, the results suggest that certain HDAC inhibitors may preferentially target chromatin regions regulating RNA processing genes, which are crucial for tumour cell proliferation^59,60,61 (Fig. 5e).

Intriguingly, we observed a correlation between the selectivity of downstream gene programs and the half-maximal inhibitory concentration (IC₅₀) values reported in the literature⁵⁸ (Fig. 5f). Specifically, compounds with lower IC₅₀ values tend to influence RNA-related pathways, whereas those with higher IC₅₀ values were associated with mitochondrial functions. To further explore the molecular basis of this divergence, we examined the interactions between human HDAC2 and either panobinostat (enriched for RNA-related genes) or tucidinostat (enriched for mitochondrial-related genes). Although molecular docking revealed no major differences in their potential interactions with the zinc-dependent HDAC protein, the aniline group in tucidinostat allowed it to embed more deeply into the HDAC2 pocket (Fig. 5g). Interestingly, all four compounds in cluster 3 shared similar warheads, a feature absent in other compounds (Fig. 5g and Supplementary Fig. 2). This structural feature introduces a steric effect that may influence the efficacy of compounds⁶² and lead to divergent downstream response, a phenomenon known as functional selectivity^63,64,65,66. However, the mitochondrial preference and lower potency of compounds like tucidinostat may also result from higher lipophilicity, which can promote off-target or non-specific effects. Nonetheless, these findings highlight CellNavi’s potential to elucidate the intricate mechanisms of action underlying drug interventions, highlighting an approach to optimize drug efficacy and specificity for targets involving complex downstream signalling pathways.

CellNavi generalizes to novel cell types

Lastly, we evaluated the generalization capability of CellNavi. We focused on a CRISPR interference screen across HEK293FT and K562 cell lines⁶⁷. The cell types are markedly different in origin and characteristics—HEK293FT cells are derived from human embryonic kidney cells, while K562 cells are derived from human chronic myelogenous leukaemia (Fig. 6a). In this experiment, CellNavi was trained on HEK293FT cells, with all K562 cells held out as the test set (Methods).

**Fig. 6: CellNavi predicts driver genes in novel cell types.**

For the 16 perturbations targeting the cleavage and polyadenylation regulatory machinery (Fig. 6a), CellNavi achieved a macro F1 score of 0.432 on top-1 predictions (Fig. 6b). The model misclassified some genes encoding components of the CPSF and CSTF complexes, probably due to their similar post-perturbation transcriptomic profiles (Fig. 6c). However, the model performed well in predicting CPSF6 and NUDT21, which exhibit highly similar transcriptomic profiles after perturbation. Interestingly, despite distinct post-perturbation transcriptomic profiles for RPRD1A and RPRD1B perturbations, the model confused these genes in many cases. As the protein products of these genes form heterodimers to dephosphorylate the RNA polymerase II C-terminal domain⁶⁸, the model may be prioritizing functional interactions and shared pathways over expression differences, leading to the misinterpretation of these genes.

By comparing the similarities between cell groups stratified by true versus predicted perturbations, we found that both intra- and interperturbation correlations for predicted labels closely mirrored those of the true labels (Fig. 6c,d and Extended Data Fig. 9). This suggests that cells grouped by predicted perturbations exhibit gene expression signatures highly similar to those grouped by true perturbations. Although prediction accuracy may partly benefit from conserved perturbation effects across cell types, CellNavi remains effective even when applied to cell types markedly different from those used in training, demonstrating robust generalization across diverse cellular contexts.

Discussion

Understanding the regulatory mechanisms that govern cell identity and transitions stand a central challenge in cell biology^{6,69,70,71,72}. In this study, we introduce CellNavi, a deep learning framework designed to identify driver genes—key factors that orchestrate complex cellular transitions—across diverse biological contexts. By modelling cell states on a biologically informed manifold constructed from large-scale single-cell transcriptomic data and gene graph priors, CellNavi achieves accurate and generalizable predictions across multiple tasks and datasets.

Describing cell states on a manifold that captures their biological dimensions has been a long-lasting endeavour^{32,73,74,75,76}. Here, we utilized a structured gene graph derived from NicheNet to facilitate cell state manifold learning via deep neural networks. NicheNet is a comprehensive gene–gene graph integrating both GRNs and intercellular signalling pathways. This prior improved the accuracy for driver gene prediction compared with alternative or randomized graphs (Extended Data Table 1). Also, integrating prior gene graphs allowed CellNavi to place greater emphasis on transcription factors, which are crucial for defining cell states and orchestrating transitions^3,10,69 (Supplementary Figs. 3 and 4). This explicit focus on regulatory elements provides CellNavi with a distinct advantage to model complex biological processes and highlights the value of graph-based learning in improving model interpretability and biological relevance. However, we caution that attention mechanisms do not equate to mechanistic interpretability. The explainability remains a critical challenge for deep learning models, including CellNavi. Future work should develop tools to visualize and interpret how graph structures and attention dynamics shape predictions of driver genes.

Our construction of cell-type-specific graphs involves removing edges for genes with zero expression, based on a simplified assumption that such genes are unlikely to participate in active regulation. Consistent with the previous practices in single-cell foundation models^12,13,17 and cell-type-specific protein representation⁷⁷ learning, we expect this filtering to help reduce noise and highlight biologically relevant interactions. Yet, we recognize that zero expression values may also stem from technical artifacts such as dropout or low sequencing depth, rather than true biological absence. Future studies should assess alternative strategies, such as imputation or single-cell-level network construction⁷⁸, to balance denoising and information retention.

Inherent noise in biological data presents a substantial challenge for modelling. To mitigate technical variability, such as dropout events and differences in sequencing depth, we used a downsampling recovery pretraining strategy with a mixed downsampling rate. This strategy aligns input data of varying depths and improves robustness in handling real-world datasets. Additional noise arises from variability in CRISPR perturbation efficiency, including fluctuating perturbation success rates and off-target effects caused by intrinsic cellular stochasticity. Although CRISPR screens provide a rich and diverse dataset for CellNavi training, this noise may lead to inconsistent labels and biased learning. To mitigate this, future efforts could pool data from multiple batches, sources and single guide RNAs to reduce biases associated with specific experimental conditions. In addition, integrating orthogonal perturbation data, such as chemical treatments, could complement CRISPR-based data and further enhance model robustness.

CellNavi represents a pioneering effort to benchmark the performance and generalization capacity of deep learning methods on driver gene identification task. While the results are promising, several limitations remain. First, the current pipeline requires fine-tuning on single-cell CRISPR screen data relevant to the system of interest. While our proof-of-concept test involving HEK293FT and K562 cells demonstrated promising results (Fig. 6), the extent to which CellNavi can generalize to entirely new cell types or experimental systems remains unclear. Addressing this will require testing across more diverse contexts and quantifying the ‘distance’ between systems to determine when fine-tuning is necessary. A long-term goal is to reduce the dependence on such datasets by developing models that generalize with minimal experimental effort.

Second, CellNavi cannot yet generalize to novel genes, which limits its broader applicability. Expanding this capacity would require capturing gene networks and representations that enable extrapolation beyond the training dataset. While single-cell CRISPR experiments encompassing a broader range of target genes and cell types are desirable, integrating generative models to infer missing relationships could further improve the model’s capacity to handle novel genes.

Third, CellNavi lacks the ability to accurately model long-range transitions owing to its reliance on CRISPR perturbations and static snapshots of transcriptomic data. Many biological processes, such as differentiation and disease progression, unfold gradually through transient states not captured in steady-state data. Incorporating time-resolved single-cell data measurements could help construct dynamic manifolds that better reflect these processes.

Despite these challenges, CellNavi marks a major advance in modelling cell state transitions and identifying their genetic drivers. By combining biologically informed priors with advanced deep learning techniques, CellNavi achieves high accuracy and generalizability in diverse biological contexts. As we continue to refine and expand models like CellNavi, we are paving the way for novel treatments targeting the root causes of diseases with unprecedented specificity.

Methods

Input embeddings

In CellNavi, we use single-cell raw count matrices as the only input. Specifically, the single-cell sequencing data are processed into a cell-by-gene count matrix, ${\bf{X}}\in {{\mathbb{R}}}^{N\times G}$, where each element ${{\bf{X}}}_{n,g}$ represents the expression of the nth cell and the gth gene (or read count of the gth RNA).

To better describe a gene’s state in a cell, we involve both gene name and gene expression information in its input embeddings. Formally, the input embedding of a token is the concatenation of gene name embedding and gene expression embedding.

Gene name embedding

We use a learnable gene name embedding in CellNavi. The vocabulary of genes is obtained by taking the union set of gene names among all datasets. Then, the integer identifier of each gene in the vocabulary is fed into an embedding layer to obtain its gene name embedding. In addition, we incorporate a special token ${\rm{CLS}}$ in the vocabulary for aggregating all genes into a cell representation. The gene name embedding of cell $n$ can be represented as ${{\bf{h}}}_{n}^{({\rm{name}})}\in {{\mathbb{R}}}^{(G+1)\times H}$:

$$\,{{\bf{h}}}_{n}^{({\rm{name}})}=\left[{{\bf{h}}}_{n,{\rm{CLS}}}^{\left({\rm{name}}\right)},\,{{\bf{h}}}_{n,\,1}^{\left({\rm{name}}\right)},\,{{\bf{h}}}_{n,\,2}^{\left({\rm{name}}\right)},\ldots ,\,{{\bf{h}}}_{n,G}^{\left({\rm{name}}\right)}\right],$$

where $H$ is the dimension of embeddings, which is set to 256.

Gene expression embedding

One major challenge in modelling gene expression is the variability in absolute magnitudes across different sequencing protocols¹³. We tackled this challenge by normalizing the raw count expression for each cell using the shifted logarithm, which is defined as

$${\widetilde{{\bf{X}}}}_{n,g}=\log \left(L\frac{{{\bf{X}}}_{n,g}}{\sum _{{g}^{{\prime} }}{{\bf{X}}}_{n,{g}^{{\prime} }}}+1\right),$$

where ${{\bf{X}}}_{n,g}$ is the raw count of gene $g$ in cell $n$, $L$ is a scaling factor and we used a fixed value ${L}={1\times 10}^{4}$ in this study, and ${\widetilde{{\bf{X}}}}_{n,g}$ denotes the normalized count. Finally, a linear layer was applied on the normalized expression ${\widetilde{{\bf{X}}}}_{n,g}$ to obtain the gene expression embedding. For the ${\rm{CLS}}$ token, we set it as a unique value for gene expression embedding. The gene expression embedding of cell $n$ can be represented as ${{\bf{h}}}_{n}^{({\rm{expr}})}\in {{\mathbb{R}}}^{(G+1)\times H}$:

$${{\bf{h}}}_{n}^{({\rm{expr}})}=\left[{{\bf{h}}}_{n,{\rm{CLS}}}^{\left({\rm{expr}}\right)},\,{{\bf{h}}}_{n,1}^{\left({\rm{expr}}\right)},\,{{\bf{h}}}_{n,2}^{\left({\rm{expr}}\right)},\ldots ,\,{{\bf{h}}}_{n,G}^{\left({\rm{expr}}\right)}\right].$$

The final embedding of cell $n$ is defined as the concatenation of ${{\bf{h}}}_{n}^{({\rm{name}})}$ and ${{\bf{h}}}_{n}^{({\rm{expr}})}$:

$${{\bf{h}}}_{n}={\rm{SUM}}\left({{\bf{h}}}_{n}^{\left({\rm{name}}\right)},\,{{\bf{h}}}_{n}^{\left({\rm{expr}}\right)}\right)\in {{\mathbb{R}}}^{\left(G+1\right)\times H}.$$

Cell manifold model

Model architecture

The CMM, is composed of six layers of a transformer variant that is designed specifically for processing graph-structured data (GeneGraph attention layers)¹⁹. The encoder takes the input embeddings to generate cell representations and uses only genes with non-zero expressions. To further speed up training, also as an approach of data augmentation, we performed a gene sampling strategy by randomly selecting at most 2,048 genes as input. It should be noted that the strategy is applied only during training; all non-zero genes are included at inference stage to avoid information loss. We use ${{\bf{h}}}_{n}^{(l)}$ to represent the embedding of cell n at the lth layer, where ${{\bf{h}}}_{n}^{(l)}$ is defined as

$${{\bf{h}}}_{n}^{(l)}=\left\{\begin{array}{c}{{\bf{h}}}_{n},\,l=0,\\ {\rm{GeneGraphAttnLayer}}\left({{\bf{h}}}_{n}^{\left(l-1\right)}\right),l\in \left[1,6\right].\end{array}\right.$$

The multihead attention module in each GeneGraph attention layer consists of three components. In addition to a self-attention module, a centrality encoding module and a spatial encoding module are also incorporated to modify the standard self-attention module for graph data integration.

We start by introducing the standard self-attention module. Let ${N}_{{\rm{heads}}}$ be the number of heads in the self-attention module. In the lth layer, ith head, self-attention is calculated as

$$\begin{array}{c}{{\bf{Q}}}_{n}^{(l,i)}={{\bf{h}}}_{n}^{(l)}{{\bf{W}}}^{({\rm{qry}},i)},{{\bf{K}}}_{n}^{(l,i)}={{\bf{h}}}_{n}^{(l)}{{\bf{W}}}^{({\rm{key}},i)},{{\bf{V}}}_{n}^{(l,i)}={{\bf{h}}}_{n}^{(l)}{{\bf{W}}}^{({\rm{val}},i)},\\ {{\bf{A}}}_{n}^{(l,i)}=\frac{{{\bf{Q}}}_{n}^{(l,i)}{\left({{\bf{K}}}_{n}^{(l,i)}\right)}^{\top }}{\sqrt{D}},{\rm{Attn}}\frac{(l,i)}{n}={\rm{softmax}}\left({{\bf{A}}}_{n}^{(l,i)}\right){{\bf{V}}}_{n}^{(l,i)},\\ {{\bf{h}}}_{n}^{(l){\prime} }={\rm{CONCAT}}\left({{\rm{Attn}}}_{n}^{(l,1)},\cdots ,{{\rm{Attn}}}_{n}^{(l,{N}_{{\rm{heads}}})}\right){{\bf{W}}}^{({\rm{out}})}\in {{\mathbb{R}}}^{(G+1)\times 2H},\end{array}$$

where ${{\bf{W}}}^{\left({\rm{qry}},{i}\right)}$, ${{\bf{W}}}^{\left({\rm{key}},i\right)}$ and ${{\bf{W}}}^{\left({\rm{val}},i\right)}\in {{\mathbb{R}}}^{2H\times D}$ are learnable matrices that project input embedding ${{\bf{h}}}_{n}^{\left(l\right)}$ of cell $n$ in to ${{\bf{Q}}}_{n}^{\left(l,i\right)}$,$\,{{\bf{K}}}_{n}^{\left(l,i\right)}$ and ${{\bf{V}}}_{n}^{\left(l,i\right)}$, the symbol ${{\bf{W}}}^{({\rm{out}})}\in {{\mathbb{R}}}^{\left(D{N}_{{\rm{heads}}}\right)\times 2H}$ is a learnable linear projection that refines the output of multihead attention, and D is the feature dimension for each attention head that satisfies DN_heads = 2H. The output of multihead attention ${{{\bf{h}}}_{{\rm{n}}}^{\left(l\right)}}^{{\prime} }$ is then passed through a layer normalization layer and a multilayer perceptron (MLP) model, producing the final output ${{\bf{h}}}_{n}^{(l+1)}$ as the input to the next layer.

The standard attention mechanism processes features of each individual gene independently, whereas the gene graph incorporates relational information between genes. To incorporate the gene graph information into the model, the centrality encoding module projects the relational information into the regulatory activity feature of each single gene, and the spatial encoding module directly incorporates the relational information with the attention mechanism. More specifically, we define ${{\bf{z}}}_{{{\rm{deg}}}^{-}({\mathcal{G}},g)}^{-}$ and ${{\bf{z}}}_{{{\rm{deg}}}^{+}({\mathcal{G}},g)}^{+}$, learnable embeddings describing in-degree deg⁻ and out-degree deg⁺ of gene g on the gene graph ${\mathcal{G}}$. We add these embeddings to the gene embeddings to update cell encoding:

$${{\bf{h}}}_{n,g}^{(l)}={{\bf{h}}}_{n,g}^{(l){\prime}}+{{\bf{z}}}_{{{\rm{deg}}}^{-}({\mathcal{G}},g)}^{-}+{{\bf{z}}}_{{{\rm{deg}}}^{+}({\mathcal{G}},g)}^{+}.$$

This cell encoding update by the centrality encoding module is applied before the self-attention module.

The spatial encoding module aims to capture regulation relations between genes from the gene graph. For this purpose, we generate the distance matrix ${\bf{S}}\in {{\mathbb{N}}}^{G\times G}$, which contains the shortest distances between gene pairs on the gene graph ${\mathscr{G}}$. We assign each element in ${\bf{S}}$ as a learnable bias added to attention weights:

$${{\bf{A}}}_{{g}_{1},{g}_{2}}^{{\prime} }={{\bf{A}}}_{{g}_{1},{g}_{2}}+b\left({{\bf{S}}}_{{g}_{1},{g}_{2}}\right),$$

where $b$ is a learnable scalar-valued function of the distance $\,{{\bf{S}}}_{{g}_{1},{g}_{2}}$. It assigns a special value to genes that are not connected to the graph. We use ${{\bf{A}}}^{{\prime} }$ in place of the original attention weights ${\bf{A}}$ in the standard self-attention module when computing self-attention in our model. In our implementation, we apply layer normalization and an MLP before computing multihead self-attention. The cell representation output from the CMM, ${{\bf{h}}}_{n,{\rm{CLS}}}^{(6)}$, is subsequently passed through a fully connected layer, where the dimensionality is increased from 256 to 2,048. This resulting value serves as the cell coordinate for cell $n$, denoted as ${{\bf{CRD}}}_{n}$.

CMM pretraining task

The CMM is expected to generate cell coordinates that parameterize the intrinsic features and variables (that are much less than the dimensions in the raw gene expression profile representation) of a cell state and maintain cell similarity in the vector space, to provide a concise and biologically relevant representation for the DGP to consume. To achieve this, we design a downsampling reconstruction pretraining task, which asks the CMM to produce a cell coordinates of a downsampled gene expression ${{\bf{X}}}_{n}^{\left({\rm{ds}}\right)}$ of a cell $n$, that allows a separate decoder model to reconstruct the original gene expression ${{\bf{X}}}_{n}$ of that cell as accurate as possible. To achieve this, the CMM is enforced to capture the co-varying patterns among the raw gene expression dimensions, hence helping the CMM to extract the underlying intrinsic variables.

Specifically, for the downsampling process, we downsample the raw count expression of each gene via a binomial distribution. The downsampled expression ${{\bf{X}}}_{n,g}^{({\rm{ds}})}$ of the nth cell and the gth gene is produced by

$${{\bf{X}}}_{n,g}^{\left({\rm{ds}}\right)}\sim B\left({{\bf{X}}}_{n,g},\frac{1}{{r}^{\;\left({\rm{ds}}\right)}}\right),$$

where the ∼ denotes ‘is distributed as’, ${{\bf{X}}}_{n,g}$ is the raw count of gene $g$ in cell $n$, ${r}^{\;\left({\rm{ds}}\right)}$ is the downsample rate that is uniformly sampled from $[1,\,20)$, and $B$ denotes the binomial distribution. The decoder is an MLP consisting of two linear layers. For each downsampled gene expression, the decoder concatenates the cell coordinates ${{\bf{CRD}}}_{n}$ of ${{\bf{X}}}_{n}^{\left({\rm{ds}}\right)}$ produced by the CMM and the embedding of that gene as the direct input to the MLP. The MLP output comes in the same shape as ${{\bf{X}}}_{n}$.

The learning objective for reconstructing the original gene expression profile ${{\bf{X}}}_{n}$ from the downsampled version ${{\bf{X}}}_{n}^{\left({\rm{ds}}\right)}$ is

$${ {\mathcal L} }_{{\rm{recons}}}=\frac{1}{N}\mathop{\sum }\limits_{n=1}^{N}{\Vert {\rm{DEC}}\left({\bf{C}}{\bf{R}}{\bf{D}}\left({{\bf{X}}}_{n}^{({\rm{ds}})}\right)\right)-{{\bf{X}}}_{n}\Vert }^{2},$$

where $\,{\|\cdot \|}^{2}$ represents the squared 2-norm of a vector. Both the CMM and the decoder are optimized. After pretraining, the CMM is to be used for driver gene prediction, while the decoder is discarded.

Driver gene predictor

The driver gene classifier is an MLP consisting of two linear layers. It is optimized to predict the perturbed genes from a pair of cell coordinates output by the CMM. To be more specific, transcriptomes of source cell ${{\bf{X}}}_{{\rm{src}}}$ and target cell ${{\bf{X}}}_{{\rm{tgt}}}$ are mapped to cell coordinates ${{\bf{CRD}}}_{{\rm{src}}}$ and ${{\bf{CRD}}}_{{\rm{tgt}}}$ with the CMM. For the direct input features, the DGP concatenates the two cell coordinates and then proceeds with an MLP, which outputs the logits of genes. We use the cross-entropy loss for training the DGP:

$${{\mathscr{L}}}_{{\rm{driver\_gene}}}={\rm{CE}}\left({\rm{DGP}}\left({\rm{CONCAT}}\left({\bf{CRD}}\left({{\bf{X}}}_{{\rm{src}}}\right),{\bf{CRD}}\left({{\bf{X}}}_{{\rm{tgt}}}\right)\right)\right),{g}_{{\rm{drv}}}\right),$$

where ${\rm{CE}}\left({\bf{l}},g\right)=\frac{{{\bf{l}}}_{g}}{\log {\sum }_{{g}^{{\prime} }}\exp \left({{\bf{l}}}_{{g}^{{\prime} }}\right)}$ is the cross-entropy loss, and ${g}_{{\rm{drv}}}$ denotes the driver gene corresponding to ${{\bf{X}}}_{{\rm{src}}}$ and ${{\bf{X}}}_{{\rm{tgt}}}$. The loss is finally averaged over all $\left({{\bf{X}}}_{{\rm{src}}},{{\bf{X}}}_{{\rm{tgt}}},{g}_{{\rm{drv}}}\right)$ tuples in the dataset. The pretrained CMM used to produce ${{\bf{CRD}}}_{{\rm{src}}}$ and ${{\bf{CRD}}}_{{\rm{tgt}}}$ is also fine-tuned together with the DGP by this loss.

Additional training details for CellNavi are available in Supplementary Note 4.

Baselines

SCENIC and SCENIC+

For each test dataset, SCENIC+ inferred a GRN, identified regulons ${{\bf{W}}}_{r}\in {{\mathbb{R}}}^{{N}_{r}\times {N}_{g}}$, and computed regulon activity ${{\bf{W}}}_{a}\in {{\mathbb{R}}}^{{N}_{c}\times {N}_{r}}$ in the cells, where ${N}_{r},\,{N}_{g}\,\text{and}\,{N}_{c}$ represent the number of identified regulons, genes and cells in the test dataset, respectively. ${{\bf{W}}}_{r}$ is a learnt matrix containing the weights of genes for different regulons, and ${{\bf{W}}}_{a}$ indicates the regulon activities for each cell. Then, we used ${{\bf{W}}}_{g}={{\bf{W}}}_{a}{{\bf{W}}}_{r}$ to represent the regulatory importance of each gene in cells. Based on these values (elements in ${{\bf{W}}}_{g}$), genes in each cell were ranked, with higher values indicating a greater potential role in controlling cellular identity. We applied SCENIC+ to Norman et al. and Schmidt et al. datasets. Only genes present in the perturbation pools of these datasets were included in the ranking based on ${W}_{g}$. Hyperparameters of GRN inference, regulon identification and regulon activation were set to default. Cells with no regulon activated were removed from our analysis. SCENIC+ analysis was realized by pyscenic 0.12.1.

Other GRNs

We constructed GRNs using three alternative methods: GRNBoost2, GENIE3 and RENGE, following default parameters from prior studies where applicable. Due to computational memory constraints, we limited the analysis for GENIE3 and RENGE to the top 5,000 highly variable genes. For GENIE3 and GRNBoost2, we utilized the SCENIC implementation to infer GRNs. For RENGE, which is designed to infer GRNs using time-series single-cell RNA sequencing (RNA-seq) data, we adapted the method to work with static single-cell RNA-seq data. After constructing GRNs with these methods, we applied the same downstream analysis protocol as described for the SCENIC pipeline.

In silico perturbation

In silico perturbation methods, such as GEARS, are capable of predicting transcriptomic outcomes of genetic perturbations. We trained GEARS model on the datasets mentioned in the corresponding tasks. For evaluation, we computed the cosine similarity between the predicted transcriptomic profiles under various perturbations and the corresponding profiles of cells from the test datasets. Driver genes were predicted on the basis of the similarities, and high values in similarity indicate the potential to be driver genes. GEARS analysis was realized by cell-gears 0.1.1. Data processing and training followed the data processing tutorial (https://github.com/snap-stanford/GEARS/blob/master/demo/data_tutorial.ipynb) and training tutorials (https://github.com/snap-stanford/GEARS/blob/master/demo/model_tutorial.ipynb).

DGE analysis

DGE analysis is the most frequently used method to reveal cell-type-specific transcriptomic signature. Initially, cells from the test datasets were normalized and subjected to logarithmic transformation. Subsequently, we applied the Leiden algorithm, an unsupervised clustering method, to categorize the target cells into distinct groups. The number of clusters for each test dataset was set to range from 20 to 40, ensuring that cellular heterogeneity was maintained while providing a sufficient number of cells in each group for robust statistical analysis. We selected source cells to serve as a reference for comparison and performed DGE analysis on each target cell group against this reference. The Wilcoxon signed-rank test was used to determine statistical significance. Then, significant genes were ranked according to their log-fold changes in expression as potential driver genes. Both unsupervised clustering and DGE analysis were conducted using the package scanpy 1.9.6.

The prior gene graph

The prior gene graph was constructed from NicheNet, where GRN and cellular signalling network were integrated. The gene graph is a directional graph. More specifically, for each gene node on the graph, the number of incoming edges corresponds to the genes that regulate it, while the number of outgoing edges represents the genes it regulates. In our approach, a connection was established between two genes if they were linked in either of the individual networks. The resulting integrated graph features 33,354 genes, each represented by a unique human gene symbol, and includes 8,452,360 edges that signify the potential interactions. The unweighted versions of NicheNet networks were used in our approach. For each cell, we remove the gene nodes with values of 0 in the raw count matrix of the single-cell transcriptomic profile, to construct the cell-type-specific gene graph. During pretraining, when downsampling is performed on single-cell transcriptomes, only the non-zero genes included as model input are retained to generate sample-specific graphs that guide the model’s task.

To evaluate the impact of graph connectivity and structure, we generated alternative graph configurations as follows:

(1)
Fully connected graph: A maximally connected graph where every pair of genes is connected by an edge of equal weight.
(2)
Sparsified graphs: Graphs were created by downsampling the total number of edges from the original graph to 1/10 and 1/20 of its total edges, enabling an evaluation of how reduced connectivity affects performance.
(3)
Random graphs: Randomized graphs were generated while preserving the number of nodes and certain structural properties of the original graph, such as self-loops. Edges were introduced probabilistically to maintain overall consistency with the original graph’s sparsity and connectivity.

Datasets

Human Cell Atlas

We downloaded all single-cell and single-nucleus datasets sourced from contributors or DCP/2 analysis in Homo sapiens up to March 2023, accumulating approximately 1.5 TB of raw data. We retained all experiments that included raw count matrices and standardized the variables to gene names using a mapping list obtained from Ensembl (https://www.ensembl.org/biomart/martview/574df5074dc07f2ee092b52c276ca4fc).

Norman et al

This dataset (GSE133344) measures transcriptomic consequences of CRISPR-mediated gene activation perturbations in K562 cell line. We filtered this dataset by removing cells with a total count below 3,500. After filtering, this dataset contained 105 perturbations targeting different genes, and 131 double perturbations targeting two genes simultaneously. We used unperturbed cells (with non-targeting guide RNA) as source cells and perturbed cells as target cells.

Schmidt et al

This dataset (GSE190604) measures the effects of CRISPR-mediated activation perturbations in human primary T cells under both stimulated and resting conditions. For our analysis, we excluded cells not mentioned in metadata and removed genes appeared in less than 50 cells. Gene expression levels of single guide RNA were deleted to avoid data leakage. We used unperturbed cells as source cells and perturbed cells as target cells. We also excluded cells without significant changes after perturbation following the procedure proposed by Mixscape tutorial via package pertpy. Default parameters were used for Mixscape analysis.

Cano-Gamez et al

This dataset (EGAS00001003215) comprises naive and memory T cells induced by several sets of cytokines. With cytokine stimulation, T cells are expected to differentiate into different subtypes. We took cells not treated by cytokines as source cells and cytokine-stimulated cells as target cells. This experimental set reflects the differential process of human T cells. For our analysis, clusters 14–17 were excluded because their source cells could not be reliably determined.

Fernandes et al

This dataset comprises heterogeneous dopamine neurons derived from human iPS cells. These neurons were exposed to oxidative stress and ER stress, representing PD-like phenotypes. We followed preprocessing procedures as mentioned in the original GitHub repo (https://github.com/metzakopian-lab/DNscRNAseq/blob/master/preprocessing.ipynb).

Tian et al

This dataset comprises iPS cell-derived neurons perturbed by more than 180 genes related to neurodegenerative diseases. CRISPR interference experiments with single-cell transcriptomic readouts were conducted by CRISPR droplet sequencing (CROP-seq). For our analysis, we removed genes that appeared in fewer than 50 cells. We used unperturbed cells as source cells and perturbed cells as target cells.

Srivatsan et al

This dataset (GSE139944) contains transcriptomic profiles of human cell lines perturbed by compounds. For our study, we utilized K562 cell line cells perturbed by HDAC inhibitors. We used unperturbed cells as source cells, and chemically perturbed cells as target cells. This set represents the process of cellular transition caused by drugs.

Kowalski et al

This dataset (GEO: GSE269600) measures the transcriptional consequences of CRISPR-mediated perturbations in HEK293FT and K562 cells. For our analysis, we excluded perturbations that consisted of fewer than 200 cells. Cells with minimal perturbation effects were removed from downstream analysis. We used cells from control groups as source cells, and perturbed cells as target cells.

T cell differentiation analysis

Identification of cellular phenotypic shift

We computed the transition score to identify cellular phenotypic shifts on the transcriptomic level. We selected canonical marker genes associated with IFNG and IL2 secretion and Th2 differentiation. Then, we computed the transition score based on the mean expression level of these marker genes, that is, ${\rm{CT}}{{\rm{S}}}_{{ij}}=\frac{1}{{K}_{i}}\mathop{\sum }\nolimits_{{k}_{i}=1}^{{K}_{i}}({g}_{{j}_{1}{k}_{i}}-{g}_{{j}_{0}{k}_{i}})$, where ${\rm{CT}}{{\rm{S}}}_{{ij}}$ is the transition score of phenotypic shift type i in source–target cell pair $j$, ${g}_{{j}_{1}{k}_{i}}$ and ${g}_{{j}_{0}{k}_{i}}$ are normalized gene expression levels of marker gene k for cell-state transition type i in target cell j₁ and its source cell j₀. The total number of marker genes for phenotypic shift type i is represented with K_i. Then, classes of phenotypic changes were annotated on the basis of transition score. Transition scores are calculated via function tl.score_genes from scanpy package with default parameters.

Cell type classification with predicted driver genes

We selected a series of genes related to transition mentioned above from previous studies (Supplementary Table 4). We used the term ‘likelihood scores’ to describe the probability of a gene to be a driver factor predicted by the model, that is,

$${{\rm{LS}}}_{{jk}}^{\mathrm{mod}}={p}_{{jk}}^{\mathrm{mod}},$$

where ${{\rm{LS}}}_{{jk}}^{\mathrm{mod}}$ means the likelihood score for gene $k$ in source–target cell pair $j$ from model $\mathrm{mod}$, and ${p}_{{jk}}^{\mathrm{mod}}$ represents the probability predicted by model $\mathrm{mod}$. For our analysis, the $\mathrm{mod}$ could be CellNavi, or baseline models.

Then, we aggregate likelihood scores into ‘prediction scores’ to evaluate the performance of different models:

$${{\rm{PS}}}_{{ij}}^{\mathrm{mod}}=\frac{1}{{m}_{i}}\mathop{\sum }\limits_{{k}_{i}=1}^{{m}_{i}}{{\rm{LS}}}_{{jk}}^{\mathrm{mod}},$$

where ${{\rm{PS}}}_{{ij}}^{\mathrm{mod}}$ is the prediction score of cell-state transition type i in source–target cell pair j predicted by model mod. The number of candidate driver genes for each phenotypic changing type i is m_i. Ideal prediction should reflect similar patterns as shown by the cellular transition score mentioned above. To evaluate it quantitatively, we trained decision tree classifiers with prediction scores as input to test whether predictions scores would faithfully demonstrate cell-state transition types. Classifiers were trained for each method independently, and tenfold cross-validation was conducted. Classifiers were implemented via shallow decision trees using the sklearn package.

GO enrichment analysis

We used GO enrichment analysis to explore drugs’ mechanisms of action. For each drug compound, the top 50 genes with highest scores predicted by CellNavi were used for GO enrichment analysis. The significant level was chosen to be 0.05, and the Benjamini–Hochberg procedure was used to control the false discovery rate. For implementation, we used package goatools for GO enrichment analysis.

Molecular docking

We performed molecular docking for panobinostat and tucidinostat, with a reference protein structure obtained from the PDB entry 3MAX. The ligand structures from PDB entries 3MAX and 5G3W were used to guide the initial placement of panobinostat and tucidinostat, ensuring the pose correctness of the warheads and major scaffolds. Based on such initial poses, local optimizations were performed with AutoDock Vina. PyMol was used for structure visualization.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

HCA data for CMM training were downloaded from the HCA data explorer (https://explore.data.humancellatlas.org/projects). The Norman et al.³², Tian et al.⁵² and Srivatsan et al.⁵⁸ datasets were downloaded from the scPerturb project⁷⁹ via Zenodo at https://doi.org/10.5281/zenodo.7041848 (ref. ⁸⁰). The raw count data of Schmidt et al.³⁰ dataset were downloaded from the National Institutes of Health GEO with accession number GSE190604, and its metadata were downloaded via Zenodo at https://doi.org/10.5281/zenodo.5784650 (ref. ⁸¹). The Cano-Gamez et al.³⁸ dataset was downloaded from the Open Target Platform of this project (https://www.opentargets.org/projects/effectorness). The Fernandes et al.⁵⁰ dataset was downloaded from ArrayExpress with accession number E-MTAB-9154. The preprocessed Kowalski et al.⁶⁷ dataset was downloaded via Zenodo at https://doi.org/10.5281/zenodo.7619592 (ref. ⁸²). For trajectory reconstruction, we used the dataset from GSE132188. For single-cell RNA-seq alignment across varying sequencing depths, we used data from GSE84133, specifically the Human3 sample. PDB 3MAX, 5G3W. Source data are provided with this paper.

Code availability

Custom code developed in this study is available via GitHub at https://github.com/DLS5-Omics/CellNavi. Additional software packages for modelling and data analysis include the following: python = =3.8.19, torch = =2.4.0, pandas=2.2.2, numpy = =1.23.5, scikit-learn = =1.5.1, scipy = =1.14.1, networkx = =3.3, scanpy = =1.10.3. cell-gears = =0.1.2, pyscenic = =0.12.1, renge = =0.0.3, cell-gears=0.1.1, goatools = =1.4.12, pertpy (2024.04).

References

Takahashi, K. et al. Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell 131, 861–872 (2007).
Article CAS PubMed Google Scholar
Li, P. et al. Reprogramming of T cells to natural killer–like cells upon Bcl11b deletion. Science 329, 85–89 (2010).
Article CAS PubMed PubMed Central Google Scholar
Zaret, K. S. & Carroll, J. S. Pioneer transcription factors: establishing competence for gene expression. Genes Dev. 25, 2227–2241 (2011).
Article CAS PubMed PubMed Central Google Scholar
Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).
Article CAS PubMed PubMed Central Google Scholar
Bravo González-Blas, C. et al. SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks. Nat. Methods 20, 1355–1367 (2023).
Article PubMed PubMed Central Google Scholar
Wang, P. et al. Deciphering driver regulators of cell fate decisions from single-cell transcriptomics data with CEFCON. Nat. Commun. 14, 8459 (2023).
Article CAS PubMed PubMed Central Google Scholar
Fleck, J. S. et al. Inferring and perturbing cell fate regulomes in human brain organoids. Nature 621, 365–372 (2023).
Article CAS PubMed Google Scholar
Yuan, Q. & Duren, Z. Inferring gene regulatory networks from single-cell multiome data using atlas-scale external data. Nat. Biotechnol. https://doi.org/10.1038/s41587-024-02182-7 (2024).
Pratapa, A., Jalihal, A. P., Law, J. N., Bharadwaj, A. & Murali, T. M. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat. Methods 17, 147–154 (2020).
Article CAS PubMed PubMed Central Google Scholar
Badia-i-Mompel, P. et al. Gene regulatory network inference in the era of single-cell multi-omics. Nat. Rev. Genet. 24, 739–754 (2023).
Article CAS PubMed Google Scholar
Regev, A. et al. The Human Cell Atlas. eLife 6, e27041 (2017).
Article PubMed PubMed Central Google Scholar
Theodoris, C. V. et al. Transfer learning enables predictions in network biology. Nature https://doi.org/10.1038/s41586-023-06139-9 (2023).
Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods https://doi.org/10.1038/s41592-024-02201-0 (2024).
Yang, F. et al. scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data. Nat. Mach. Intell. 4, 852–866 (2022).
Article Google Scholar
Zhao, W. X. et al. A survey of large language models. Preprint at https://arxiv.org/abs/2303.18223 (2023).
Kaddour, J. et al. Challenges and applications of large language models. Preprint at http://arxiv.org/abs/2307.10169 (2023).
Hao, M. et al. Large-scale foundation model on single-cell transcriptomics. Nat. Methods https://doi.org/10.1038/s41592-024-02305-7 (2024).
Browaeys, R., Saelens, W. & Saeys, Y. NicheNet: modeling intercellular communication by linking ligands to target genes. Nat. Methods 17, 159–162 (2020).
Article CAS PubMed Google Scholar
Ying, C. et al. Do transformers really perform bad for graph representation? Adv. Neural Inf. Process Syst. 34, 28877–28888 (2021).
Google Scholar
Dixit, A. et al. Perturb-Seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866 (2016).
Article CAS PubMed PubMed Central Google Scholar
Adamson, B. et al. perturbation—a multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867–1882 (2016).
Article CAS PubMed PubMed Central Google Scholar
Datlinger, P. et al. Ultra-high-throughput single-cell RNA sequencing and perturbation screening with combinatorial fluidic indexing. Nat. Methods 18, 635–642 (2021).
Article CAS PubMed PubMed Central Google Scholar
Jaitin, D. A. et al. Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-seq. Cell 167, 1883–1896 (2016).
Article CAS PubMed Google Scholar
Replogle, J. M. et al. Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing. Nat. Biotechnol. 38, 954–961 (2020).
Article CAS PubMed PubMed Central Google Scholar
Replogle, J. M. et al. Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq. Cell 185, 2559–2575.e28 (2022).
Article CAS PubMed PubMed Central Google Scholar
Schmidt, R. et al. CRISPR activation and interference screens decode stimulation responses in primary human T cells. Science 375, eabj4008 (2022).
Article CAS PubMed PubMed Central Google Scholar
Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
Article CAS PubMed PubMed Central Google Scholar
Roohani, Y., Huang, K. & Leskovec, J. Predicting transcriptional outcomes of novel multigene perturbations with GEARS. Nat. Biotechnol. 42, 927–935 (2024).
Article CAS PubMed Google Scholar
Huynh-Thu, V. A., Irrthum, A., Wehenkel, L. & Geurts, P. Inferring regulatory networks from expression data using tree-based methods. PLoS ONE 5, e12776 (2010).
Article PubMed PubMed Central Google Scholar
Moerman, T. et al. GRNBoost2 and Arboreto: efficient and scalable inference of gene regulatory networks. Bioinformatics 35, 2159–2161 (2019).
Article CAS PubMed Google Scholar
Ishikawa, M. et al. RENGE infers gene regulatory networks using time-series single-cell RNA-seq data with CRISPR perturbations. Commun. Biol. 6, 1290 (2023).
Article CAS PubMed PubMed Central Google Scholar
Norman, T. M. et al. Exploring genetic interaction manifolds constructed from rich single-cell phenotypes. Science 365, 786–793 (2019).
Article CAS PubMed PubMed Central Google Scholar
Traag, V., Waltman, L. & van Eck, N. J. From Louvain to Leiden: guaranteeing well-connected communities. Sci Rep. 9, 5233 (2019).
Chen, Y. & Zou, J. Simple and effective embedding model for single-cell biology built from ChatGPT. Nat. Biomed. Eng. https://doi.org/10.1038/s41551-024-01284-6 (2024).
Kedzierska, K. Z., Crawford, L., Amini, A. P. & Lu, A. X. Zero-shot evaluation reveals limitations of single-cell foundation models. Genome Biol. 26, 101 (2025).
Liu, T., Li, K., Wang, Y., Li, H. & Zhao, H. Evaluating the utilities of foundation models in single-cell data analysis. Preprint at bioRxiv https://doi.org/10.1101/2023.09.08.555192 (2023).
Boiarsky, R., Singh, N., Buendia, A., Getz, G. & Sontag, D. A deep dive into single-cell RNA sequencing foundation models. Preprint at bioRxiv https://doi.org/10.1101/2023.10.19.563100 (2023).
Cano-Gamez, E. et al. Single-cell transcriptomics identifies an effectorness gradient shaping the response of CD4⁺ T cells to cytokines. Nat. Commun. 11, 1801 (2020).
Article CAS PubMed PubMed Central Google Scholar
Szabo, P. A. et al. Single-cell transcriptomics of human T cells reveals tissue and activation signatures in health and disease. Nat. Commun. 10, 4706 (2019).
Article CAS PubMed PubMed Central Google Scholar
Atlasy, N. et al. Single cell transcriptomic analysis of the immune cell compartment in the human small intestine and in celiac disease. Nat. Commun. 13, 4920 (2022).
Article CAS PubMed PubMed Central Google Scholar
Cachot, A. et al. Tumor-specific cytolytic CD4 T cells mediate immunity against human cancer. Sci. Adv. 7, eabe3348 (2021).
Article CAS PubMed PubMed Central Google Scholar
Henriksson, J. et al. Genome-wide CRISPR screens in T helper cells reveal pervasive crosstalk between activation and differentiation. Cell 176, 882–896 (2019).
Article CAS PubMed PubMed Central Google Scholar
Kanhere, A. et al. T-bet and GATA3 orchestrate Th1 and Th2 differentiation through lineage-specific targeting of distal regulatory elements. Nat. Commun. 3, 1268 (2012).
Article PubMed Google Scholar
Kalbasi, A. et al. Potentiating adoptive cell therapy using synthetic IL-9 receptors. Nature 607, 360–365 (2022).
Article CAS PubMed PubMed Central Google Scholar
Charvet, C. et al. Vav1 promotes T cell cycle progression by linking TCR/CD28 costimulation to FOXO1 and p27kip1 expression. J. Immunol. 177, 5024–5031 (2006).
Article CAS PubMed Google Scholar
Fischer, K.-D. et al. Defective T-cell receptor signalling and positive selection of Vav-deficient CD4⁺CDS⁺ thymocytes. Nature 374, 474–476 (1995).
Article CAS PubMed Google Scholar
Kane, L. P., Andres, P. G., Howland, K. C., Abbas, A. K. & Weiss, A. Akt provides the CD28 costimulatory signal for up-regulation of IL-2 and IFN-γ but not TH2 cytokines. Nat. Immunol. 2, 37–44 (2001).
Article CAS PubMed Google Scholar
Jaeger-Ruckstuhl, C. A. et al. Signaling via a CD27–TRAF2–SHP-1 axis during naive T cell activation promotes memory-associated gene regulatory networks. Immunity 57, 287–302 (2024).
Article CAS PubMed PubMed Central Google Scholar
Lim, H.-S. et al. Costimulation of IL-2 production through CD28 is dependent on the size of its ligand. J. Immunol. 195, 5432–5439 (2015).
Article CAS PubMed PubMed Central Google Scholar
Fernandes, H. J. R. et al. Single-cell transcriptomics of parkinson’s disease human in vitro models reveals dopamine neuron-specific stress responses. Cell Rep. 33, 108263 (2020).
Article CAS PubMed Google Scholar
Surani, M. A. Glycoprotein synthesis and inhibition of glycosylation by tunicamycin in preimplantation mouse embryos: compaction and trophoblast adhesion. Cell 18, 217–227 (1979).
Article CAS PubMed Google Scholar
Tian, R. et al. Genome-wide CRISPRi/a screens in human neurons link lysosomal failure to ferroptosis. Nat. Neurosci. 24, 1020–1034 (2021).
Article CAS PubMed PubMed Central Google Scholar
Tian, R. et al. CRISPR interference-based platform for multimodal genetic screens in human iPSC-derived neurons. Neuron 104, 239–255 (2019).
Article CAS PubMed PubMed Central Google Scholar
Quirós, P. M., Mottis, A. & Auwerx, J. Mitonuclear communication in homeostasis and stress. Nat. Rev. Mol. Cell Biol. 17, 213–226 (2016).
Article PubMed Google Scholar
Ho, T. C. S., Chan, A. H. Y. & Ganesan, A. Thirty years of HDAC inhibitors: 2020 insight and hindsight. J. Med. Chem. 63, 12460–12484 (2020).
Article CAS PubMed Google Scholar
Glozak, M. A. & Seto, E. Histone deacetylases and cancer. Oncogene 26, 5420–5432 (2007).
Article CAS PubMed Google Scholar
Marks, P. A. et al. Histone deacetylases and cancer: causes and therapies. Nat. Rev. Cancer 1, 194–202 (2001).
Article CAS PubMed Google Scholar
Srivatsan, S. R. et al. Massively multiplex chemical transcriptomics at single-cell resolution. Science 367, 45–51 (2020).
Article CAS PubMed Google Scholar
Bradley, R. K. & Anczuków, O. RNA splicing dysregulation and the hallmarks of cancer. Nat. Rev. Cancer 23, 135–155 (2023).
Article CAS PubMed PubMed Central Google Scholar
Stanley, R. F. & Abdel-Wahab, O. Dysregulation and therapeutic targeting of RNA splicing in cancer. Nat. Cancer 3, 536–546 (2022).
Bonnal, S., Vigevani, L. & Valcárcel, J. The spliceosome as a target of novel antitumour drugs. Nat. Rev. Drug Discov. 11, 847–859 (2012).
Article CAS PubMed Google Scholar
Di Micco, S. et al. Structural basis for the design and synthesis of selective HDAC inhibitors. Bioorg. Med. Chem. 21, 3795–3807 (2013).
Article PubMed Google Scholar
Perszyk, R. E. et al. Biased modulators of NMDA receptors control channel opening and ion selectivity. Nat. Chem. Biol. 16, 188–196 (2020).
Article CAS PubMed PubMed Central Google Scholar
Franks, L. N., Ford, B. M., Fujiwara, T., Zhao, H. & Prather, P. L. The tamoxifen derivative ridaifen-B is a high affinity selective CB2 receptor inverse agonist exhibiting anti-inflammatory and anti-osteoclastogenic effects. Toxicol. Appl. Pharmacol. 353, 31–42 (2018).
Article CAS PubMed PubMed Central Google Scholar
Samad, S. S., Schwartz, J.-M. & Francavilla, C. Functional selectivity of receptor tyrosine kinases regulates distinct cellular outputs. Front. Cell Dev. Biol. 11, (2024).
Bock, A. et al. The allosteric vestibule of a seven transmembrane helical receptor controls G-protein coupling. Nat. Commun. 3, 1044 (2012).
Article PubMed Google Scholar
Kowalski, M. H. et al. Multiplexed single-cell characterization of alternative polyadenylation regulators. Cell 187, 4408–4425.e23 (2024).
Article CAS PubMed PubMed Central Google Scholar
Ali, I. et al. Crosstalk between RNA Pol II C-terminal domain acetylation and phosphorylation via RPRD proteins. Molecular Cell 74, 1164–1174 (2019).
Article CAS PubMed PubMed Central Google Scholar
Joung, J. et al. A transcription factor atlas of directed differentiation. Cell 186, 209–229 (2023).
Article CAS PubMed PubMed Central Google Scholar
Rukhlenko, O. S. et al. Control of cell state transitions. Nature 609, 975–985 (2022).
Article CAS PubMed PubMed Central Google Scholar
Wang, J., Zhang, K., Xu, L. & Wang, E. Quantifying the Waddington landscape and biological paths for development and differentiation. Proc. Natl Acad. Sci. USA 108, 8257–8262 (2011).
Article CAS PubMed PubMed Central Google Scholar
Hormoz, S. et al. Inferring cell-state transition dynamics from lineage trees and endpoint single-cell measurements. Cell Syst. 3, 419–433 (2016).
Article CAS PubMed PubMed Central Google Scholar
Rafelski, S. M. & Theriot, J. A. Establishing a conceptual framework for holistic cell states and state transitions. Cell 187, 2633–2651 (2024).
Article CAS PubMed Google Scholar
Qiu, X. et al. Mapping transcriptomic vector fields of single cells. Cell 185, 690–711 (2022).
Article CAS PubMed PubMed Central Google Scholar
Liang, S. et al. Single-cell manifold-preserving feature selection for detecting rare cell populations. Nat. Comput. Sci. 1, 374–384 (2021).
Article PubMed PubMed Central Google Scholar
Yao, Z., Li, B., Lu, Y. & Yau, S.-T. Single-cell analysis via manifold fitting: a framework for RNA clustering and beyond. Proc. Natl Acad. Sci. USA 121, e2400002121 (2024).
Article PubMed PubMed Central Google Scholar
Li, M. M. et al. Contextual AI models for single-cell protein biology. Nat. Methods 21, 1546–1557 (2024).
Article CAS PubMed PubMed Central Google Scholar
Feng, G. et al. CellPolaris: decoding cell fate through generalization transfer learning of gene regulatory networks. Preprint at bioRxiv https://doi.org/10.1101/2023.09.25.559244 (2023).
Peidli, S. et al. scPerturb: harmonized single-cell perturbation data. Nat. Methods 21, 531–540 (2024).
Article CAS PubMed PubMed Central Google Scholar
Peidli, S. et al. scPerturb single-cell perturbation data: RNA and protein h5ad files. Zenodo https://doi.org/10.5281/zenodo.7041848 (2022).
Steinhart, Z. Code repository for CRISPR activation and interference screens decode stimulation responses in primary human T cells. Zenodo https://doi.org/10.5281/zenodo.5784650 (2022).
Kowalski, M. H. et al. Multiplexed single-cell characterization of alternative polyadenylation regulators (HEK293FT & K562 Perturb-seq data). Zenodo https://doi.org/10.5281/zenodo.7619592 (2024).
Baron, M. et al. A single-cell transcriptomic map of the human and mouse pancreas reveals inter- and intra-cell population structure. Cell Syst. 3, 346–360 (2016).
Article CAS PubMed PubMed Central Google Scholar
Bastidas-Ponce, A. et al. Comprehensive single cell mRNA profiling reveals a detailed roadmap for pancreatic endocrinogenesis. Development 146, dev173849 (2019).
Article CAS PubMed Google Scholar

Download references

Acknowledgements

We thank Z. Chen, S. Shao, C. Cao and Z. Tang for constructive suggestions and feedback on the manuscript and T.-Y. Liu for the supervision. We thank J. Bai for graphic designs and illustrations.

Author information

Shuxin Zheng, Yaosen Min & Huanhuan Xia
Present address: Zhongguancun Institute of Artificial Intelligence and Beijing Zhongguancun Academy, Beijing, China
These authors contributed equally: Tianze Wang, Yan Pan, Fusong Ju, Shuxin Zheng, Chang Liu.

Authors and Affiliations

Microsoft Research AI for Science, Beijing, China
Tianze Wang, Yan Pan, Fusong Ju, Shuxin Zheng, Chang Liu, Yaosen Min, Qun Jiang, Xinwei Liu, Huanhuan Xia, Guoqing Liu, Haiguang Liu & Pan Deng
Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
Tianze Wang
Department of Automation, Tsinghua University, Beijing, China
Yan Pan & Qun Jiang
Department of Physics, Beijing Normal University, Beijing, China
Xinwei Liu

Authors

Tianze Wang
View author publications
Search author on:PubMed Google Scholar
Yan Pan
View author publications
Search author on:PubMed Google Scholar
Fusong Ju
View author publications
Search author on:PubMed Google Scholar
Shuxin Zheng
View author publications
Search author on:PubMed Google Scholar
Chang Liu
View author publications
Search author on:PubMed Google Scholar
Yaosen Min
View author publications
Search author on:PubMed Google Scholar
Qun Jiang
View author publications
Search author on:PubMed Google Scholar
Xinwei Liu
View author publications
Search author on:PubMed Google Scholar
Huanhuan Xia
View author publications
Search author on:PubMed Google Scholar
Guoqing Liu
View author publications
Search author on:PubMed Google Scholar
Haiguang Liu
View author publications
Search author on:PubMed Google Scholar
Pan Deng
View author publications
Search author on:PubMed Google Scholar

Contributions

Conceptualization, P.D., T.W., S.Z. and C.L.; data curation, T.W. and P.D.; methodology, Y.P., F.J., T.W., S.Z., C.L. and P.D.; model implementation—pretrain, F.J., Y.P., G.L. and H.X.; model implementation—fine-tuning and inference, Y.P., F.J., T.W., C.L. and Q.J.; result analysis, T.W., Y.P., F.J., P.D., Y.M. and X.L.; result interpretation, T.W., P.D., Y.P., F.J. and Y.M.; writing—original draft, P.D., T.W., C.L., S.Z., F.J., Y.M. and Y.P.; writing—revision, P.D., C.L., S.Z., H.L., Y.P. and H.X.; supervision, H.L. All authors have read and approved the paper.

Corresponding author

Correspondence to Pan Deng.

Ethics declarations

Competing interests

P.D., F.J., C.L., G.L. and H.L. are paid employees of Microsoft Research. The other authors declare no competing interests.

Peer review

Peer review information

Nature Cell Biology thanks Ivan Costa, Zhana Duren and Marc Sturrock for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Details of the Cell Manifold Model (CMM) implementation.

a) The CMM is designed to reconstruct gene expression profiles by leveraging a Transformer variant composed of GeneGraph Attention Layers. b) GeneGraph Attention Layer integrates prior gene graph via its multi-head attention layer. To be noted, only sub gene graphs with nodes in the input sample (all non-zero genes) are used for attention calculation.

Extended Data Fig. 2 CellNavi can align the same cell types across varying sequencing depths and reconstruct developmental trajectories.

(a) Upper panel: After 20-fold down-sampling on the expression profile⁸³, major cell types remained distinguishable, but a severe batch effect was observed even after using standard integration methods. Lower panel: the Cell Manifold Model (CMM) in CellNavi can integrate embeddings from the down-sampled and original profiles, while preserving the biological structure. Integration quality was quantitatively assessed using iLISI (1: worst, 2: best) and cLISI (1: best, 2: worst), which evaluate batch mixing and cell type separation, respectively. See details of experimental rationales in Supplementary Note 2. (b) UMAPs colored by cell type (left) and diffusion pseudotime (DPT) (right, indicated by the colorbars), inferred from the original data (upper panel) and CellNavi-predicted data (lower panel). Cell type labels were assigned based on known markers, and pseudotime was computed using Scanpy’s DPT implementation. (c) Spearman correlation between pseudotime inferred from the original data (x-axis) and from the CellNavi-predicted data (y-axis) with a quantification in d. (d) Summary of quantitative evaluation using Spearman correlation, along with clustering evaluation metrics including Adjusted Rand Index (ARI), Normalized Mutual Information (NMI), Average Silhouette Score (ASS), and Fowlkes-Mallows Score (FMS). Scores computed from random embeddings are included for comparison. We used a mouse pancreatic endocrinogenesis dataset from Bastidas-Ponce et al.⁸⁴. Therefore, we enhanced our model with mouse scRNA-seq for this task.

Extended Data Fig. 3 Details of the Driver Gene Predictor (DGP) implementation.

(a) The DGP processes concatenated cell coordinates pairs output by the CMM to predict driver genes by generating a likelihood score vector. An optional discriminator is included to align training and test data, ensuring consistency and accuracy in predictions (Supplementary Note 4). (b) The fine-tuning data (left) consists of CRISPR perturbation datasets with low heterogeneity, typically derived from controlled experiments with limited variability. The application data (right) encompasses diverse biological conditions with high heterogeneity, including different cell types, perturbation types, and sequencing platforms. The bidirectional arrow indicates the challenges posed by differences in data sources, sequencing techniques, and biological variability when generalizing the model from fine-tuning to real-world applications. Schematic elements created with BioRender.com.

Extended Data Fig. 4 Quantitative assessment of CellNavi (extended).

(a) UMAP visualization of perturbed resting T cells and restimulated T cells sequenced by Schmidt et al, colored by cell clusters identified in the original study. (b) AUPRC (Area Under Precision-Recall Curve) and MCC (Matthews Correlation Coefficient) for driver gene prediction in the Schmidt dataset, comparing CellNavi with alternative methods. Source data are available in Supplementary Table 1. (c) Compatibility of the test dataset to SCENIC/SCENIC+. Upper panel: 61.6% of samples cannot be predicted by SCENIC/SCENIC+ due to missing regulons. Bottom panel: 58.3% of candidate driver genes are excluded because they are neither transcription factor (TF) nor TF-target genes. (d) Micro-AUROC of CellNavi applied to the original single-cell transcriptomic profile and the filtered transcriptomic profile in which perturbed genes are excluded, showing that CellNavi identifies driver genes excluded from gene expression profiles. (e) Top K (K = 1 or 5) accuracy of CellNavi applied to transcriptomic data with perturbed gene excluded. Error bar, standard error. n = 69. (f) Cells were stratified into distinct states using the unsupervised Leiden algorithm, with each state represented by a different color. A specific cell cluster was selected as the test set, while the remaining clusters were used for training. To ensure rigorous evaluation, all multi-gene perturbations were excluded from training. Consequently, the training set (light gray) consisted only of single-gene perturbations within certain clusters, while the test set was divided into: 1) single-gene perturbations from the held-out cluster (light blue), 2) double-gene perturbations from the held-out cluster (dark blue), and 3) double-gene perturbations from the training clusters (dark gray). Results reported in the main text are derived from the held-out cluster (blue cluster in the UMAP). To further ensure fairness, the test cluster was shuffled (similar to cross-validation) to obtain robust and unbiased results (Supplementary Table 3). (g) AUPRC (Area Under Precision-Recall Curve) and MCC (Matthews Correlation Coefficient) for driver gene prediction in the Norman dataset (single perturbation), comparing CellNavi with alternative methods. Source data are available in Supplementary Table 1.

Extended Data Fig. 5 Ablation study on CellNavi components.

The figure compares the performance of CellNavi under three configurations: (1) CellNavi with both CMM pretraining and DGP fine-tuning, (2) coupling the DGP with raw gene expression vectors instead of gene embeddings produced by the CMM, and (3) replacing the DGP with a simpler multinomial logistic regression model on top of the CMM. Performance is assessed under two evaluation settings: out-of-domain (the previously used Norman single perturbation split that holds one cluster out from training, red) and in-domain (a random split on the Norman dataset with the same train/test size, gray). Metrics include Top-1 accuracy, Top-5 accuracy, AUROC, AUPRC, F1 score, and MCC (Matthews Correlation Coefficient). Dashed lines indicate performance change between evaluation settings.

Extended Data Fig. 6 Transcriptional changes of canonical marker genes in IL2-high, IFNG-high, and Th2 cell types.

IL2-high marker genes: G0S2, IL2, IL21, and TAGAP. IFNG-high marker genes: IFNG, CCL3, CCL4, and CCL3L3. Th2 marker genes: GATA3, MRPS26, LIMA1 and SPINT2.

Extended Data Fig. 7 Expression changes for the top-20 predicted genes across cell pairs.

Center line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range; points, outliers. n = 47,437. Numerical data are available in the SI source data file.

Extended Data Fig. 8 Stratification of HDAC inhibitor-treated K62 cells.

(a) UMAP visualization of single-cell transcriptomic profiles from K562 cells treated with 17 distinct HDAC inhibitors. (b) The percentage of cells treated with different drug compounds within each cluster identified in Fig. 5a.

Extended Data Fig. 9 Heatmap showing gene expression across different perturbation groups for K562 cells.

Rows represent genes, and columns represent true perturbations (a) or predicted perturbations by CellNavi (b).

Extended Data Table 1 Comparison of the predictive performance of CellNavi using NicheNet and alternative graph configurations

Full size table

Supplementary information

Supplementary Information

Supplementary Notes 1–4, Tables 1–6 and Figs. 1–6.

Reporting Summary

Peer Review File

Supplementary Data

Source data for supplementary figures

Source data

Source Data

Source data for Figs. 1–6 and extended data figures.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Wang, T., Pan, Y., Ju, F. et al. CellNavi predicts genes directing cellular transitions by learning a gene graph-enhanced cell state manifold. Nat Cell Biol 27, 1863–1874 (2025). https://doi.org/10.1038/s41556-025-01755-1

Download citation

Received: 28 October 2024
Accepted: 30 July 2025
Published: 03 October 2025
Version of record: 03 October 2025
Issue date: October 2025
DOI: https://doi.org/10.1038/s41556-025-01755-1