Fig. 4: Using GraphVelo velocities to infer host-virus infection trajectory and identify host-pathogen interactions.

a Viral infection captured by the GraphVelo velocity field. Cells were colored by the percentage of viral RNA within a single cell. b Correlation between viral RNA percentage and pseudotime inferred by scVelo or GraphVelo. Correlation was calculated using a two-sided Spearman rank correlation test. c Viral RNA velocities inferred by GraphVelo along the viral RNA percentage axis. The black dotted line marks zero velocity. The solid line denotes the mean trend; dashed lines denote 1 s.d. d Boxplot summarizing the MacK scores of all viral genes calculated by GraphVelo, the CellRank pseudotime kernel and a random predictor. The number of viral genes is 67 for the statistical test. *** indicates Welch’s independent two-sided t-test at p < 0.05. Boxplots indicate the median (middle line) and the first and third quartiles (box); the upper whisker extends from the box edge to the largest value no further than 1.5 × IQR (interquartile range) from the third quartile, the lower whisker extends from the box edge to the smallest value no further than 1.5 × IQR from the first quartile, and data beyond the whiskers are outlying points that are plotted individually. e Correlation between viral infection speed and RNA abundance. Genes were ranked by Spearman correlation coefficients. Host and viral genes that contribute to viral DNA synthesis were marked on the left side and those that contribute to the viral defense response were marked on the right side. Viral genes were highlighted in red. f UMAP representation of host and viral genes with distances defined by their dynamic expression patterns along the viral RNA percentage axis. g Example dynamic expression patterns within specific clusters (Leiden 4, 5, 3, 6 from top to bottom) along the viral RNA percentage axis. Zero velocity was highlighted by the black dotted line. The solid line denotes the mean trend; the shaded region represents 1 s.d. h GO enrichment of each cluster in (g). 
Gene set enrichment was performed using a one-sided Fisher’s exact test with Benjamini–Hochberg correction. Adjusted p-values represent FDR-corrected significance of gene set enrichment. i Top host genes inhibited by each viral factor based on dynamo Jacobian analyses. Host effectors were organized by the pathways in which they are involved. j Dynamo prediction of total viral RNA change in response to in silico viral factor knockout. Viral factors were ranked by the mean of total viral RNA changes. The number of samples is 1454 for the virtual perturbation screen. Boxplots indicate the median (middle line) and the first and third quartiles (box); the upper whisker extends from the box edge to the largest value no further than 1.5 × IQR (interquartile range) from the third quartile and the lower whisker extends from the box edge to the smallest value no further than 1.5 × IQR from the first quartile. k Vector field change resulting from infinitesimal inhibition of UL123 during the viral infection process. Source data are provided as a Source Data file.
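
The whisker convention used for the boxplots in panels d and j can be illustrated with a minimal sketch. This is not the plotting code used for the figure: the function name `tukey_whiskers`, the example `scores` and the choice of linearly interpolated quartiles are assumptions made for illustration only.

```python
from statistics import quantiles


def tukey_whiskers(values):
    """Return (lower_whisker, upper_whisker, outliers) under the 1.5 x IQR rule.

    Hypothetical helper for illustration; requires at least two data points.
    """
    data = sorted(values)
    # Assumption: quartiles by linear interpolation ("inclusive" method).
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    # Points beyond the fences are drawn individually as outliers.
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return inside[0], inside[-1], outliers


# Made-up scores, not values from the study.
scores = [0.1, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 2.0]
low, high, out = tukey_whiskers(scores)
print(low, high, out)
```

In this toy example the value 2.0 lies more than 1.5 × IQR above the third quartile, so it is reported as an outlier and the upper whisker stops at the next largest value.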