Fig. 1: Overview of the MultiVeloVAE model.

a Diagram of the multi-omic dynamics assumption (ref. 9) (top) and the MultiVeloVAE neural network architecture (bottom). The network takes chromatin accessibility (c), unspliced RNA (u), and spliced RNA (s) values as input, along with optional sample covariates such as batch (b). The encoder network infers a cell state and a latent time t for each cell from (c, u, s). The chromatin input c is optional, allowing inference from scRNA-seq data, multiome data, or both. Note that the latent time t is shared across all genes. The decoder network infers the cell-specific and gene-specific chromatin state k_c and transcription rate ρ from the cell state and latent time (and optionally the sample covariates b). The decoder then reconstructs (c, u, s) using the analytical solution to an ODE. b Table summarizing the advantages of MultiVeloVAE over existing velocity inference methods. c When applied to multiple datasets, MultiVeloVAE can integrate samples that differ in technical effects such as library size or sequencing time, identify corresponding cell states across different biological contexts, and infer joint dynamics for all cells.
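
To make the decoder step in panel a more concrete, the following is a minimal sketch of a chromatin-to-spliced-RNA ODE for a single gene in a single cell. The specific form of the equations (chromatin relaxing toward k_c, transcription proportional to ρ and c, followed by splicing and degradation) is an assumption for illustration and is not stated in the caption. The rate names `alpha_c`, `alpha`, `beta`, and `gamma`, their default values, and both function names are hypothetical. The sketch integrates numerically with forward Euler, whereas the model described above uses an analytical solution; only the closed form for c is included here for comparison.

```python
import math


def simulate(k_c, rho, alpha_c=0.5, alpha=2.0, beta=1.0, gamma=0.5,
             t_end=40.0, dt=0.001, c0=0.0, u0=0.0, s0=0.0):
    """Forward-Euler integration of an ASSUMED three-variable ODE.

    Assumed dynamics (illustrative, not taken from the caption):
        dc/dt = alpha_c * (k_c - c)          # chromatin relaxes toward k_c
        du/dt = rho * alpha * c - beta * u   # transcription minus splicing
        ds/dt = beta * u - gamma * s         # splicing minus degradation
    Returns the state (c, u, s) at time t_end.
    """
    c, u, s = c0, u0, s0
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        dc = alpha_c * (k_c - c)
        du = rho * alpha * c - beta * u
        ds = beta * u - gamma * s
        c, u, s = c + dt * dc, u + dt * du, s + dt * ds
    return c, u, s


def chromatin_closed_form(t, k_c, alpha_c=0.5, c0=0.0):
    """Closed-form solution of the assumed chromatin equation alone."""
    return k_c + (c0 - k_c) * math.exp(-alpha_c * t)


if __name__ == "__main__":
    # Open chromatin (k_c = 1) with a hypothetical transcription scaling rho.
    c, u, s = simulate(k_c=1.0, rho=0.8)
    print(f"late-time state: c={c:.3f}, u={u:.3f}, s={s:.3f}")
```

Under these assumed equations, the late-time state approaches c = k_c, u = ρ·alpha·k_c/beta, and s = ρ·alpha·k_c/gamma, which gives a quick sanity check on any reconstruction of (c, u, s).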