Fig. 1: Schematic of multiDGD’s architecture and generative process.
From: multiDGD: A versatile deep generative model for multi-omics data

The representations Z are the input to the decoder. They are distributed in latent space according to a Gaussian mixture model (GMM) parameterized by ϕ. Zbasal and ϕ denote the unsupervised basal embedding and its distribution, respectively; we refer to this part as the latent model. A novelty in multiDGD is the covariate model: Zcov and ϕcov denote the supervised representations and GMM for a given categorical covariate. For each data point (cell) i of the N samples, there exists a latent vector of length L, plus 2 dimensions for each modeled covariate. The input is transformed into modality-specific predicted normalized mean counts y by the branched decoder θ. These outputs are then scaled by sample-wise count depths to predict the density of both the RNA and ATAC data. Red arrows depict the backpropagation and updating of parameters during training.
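The generative process described above can be sketched in code. This is a minimal, hypothetical illustration, not multiDGD's actual implementation: the dimensions, the single-Gaussian stand-in for the GMMs ϕ and ϕcov, and the toy decoder are all placeholder assumptions chosen only to show the data flow (sample latents, concatenate, decode per modality, scale by count depth).

```python
import random

# Hypothetical dimensions (not the paper's actual configuration).
L = 4       # basal latent dimensionality
N_COV = 1   # number of categorical covariates; 2 latent dims each

def sample_representation(dim_basal, n_cov, rng):
    """Sample z_basal (length L) and z_cov (2 dims per covariate).
    A single standard Gaussian stands in for the GMM components
    phi and phi_cov purely for illustration."""
    z_basal = [rng.gauss(0.0, 1.0) for _ in range(dim_basal)]
    z_cov = [rng.gauss(0.0, 1.0) for _ in range(2 * n_cov)]
    return z_basal + z_cov  # concatenated input to the decoder

def branched_decoder(z, n_rna=3, n_atac=3):
    """Toy branched decoder theta: a shared transform followed by
    modality-specific heads emitting normalized mean counts y."""
    shared = [abs(v) for v in z]            # placeholder shared layer
    total = sum(shared) or 1.0
    y_rna = [v / total for v in shared[:n_rna]]    # RNA head
    y_atac = [v / total for v in shared[:n_atac]]  # ATAC head
    return y_rna, y_atac

def scale_by_depth(y, depth):
    """Scale normalized means by the sample-wise count depth."""
    return [depth * v for v in y]

rng = random.Random(0)
z = sample_representation(L, N_COV, rng)
y_rna, y_atac = branched_decoder(z)
mu_rna = scale_by_depth(y_rna, depth=10_000)  # e.g. total RNA counts of cell i
mu_atac = scale_by_depth(y_atac, depth=5_000) # e.g. total ATAC fragments of cell i
```

In the actual model the predicted means parameterize count distributions (the "density" of the RNA and ATAC data), and the red-arrow backpropagation updates both the decoder parameters θ and the representations Z with their GMM parameters ϕ.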