Fig. 1: Overview of the boosting autoencoder approach. | Communications Biology

From: Infusing structural assumptions into dimensionality reduction for single-cell RNA sequencing data to identify small gene sets

Top row: Overview of the boosting autoencoder (BAE) architecture. A gene-by-cell matrix is mapped to a low-dimensional latent space via multiplication by a sparse encoder weight matrix B, fitted via a componentwise boosting approach that can be constrained to encode additional assumptions (illustrated by the exclamation mark).

Middle row: Training process with a constraint for disentangled latent dimensions. The encoder weights are initialized to zero (1), and for the first latent dimension, one coefficient of B is updated to a nonzero value (2). In all subsequent updates, only variables complementary to those already selected may be chosen (3, constraint indicated by the exclamation mark). After updating one coefficient for each dimension, the decoder parameters θ are updated via gradient descent (4). Steps (2)–(4) are repeated in subsequent training epochs.

Bottom row: Adding a constraint for coupling time points. To extract differentiation trajectories from time-series scRNA-seq data, a BAE is trained at each time point (1–3). The encoder weight matrix, trained with the disentanglement constraint (red exclamation mark), is passed to the subsequent time point in a pre-training strategy (blue exclamation mark) to couple dimensions corresponding to the same developmental pattern across time (indicated by the dashed box around the second dimension).
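The per-epoch cycle described in the middle row can be sketched as follows. This is a hypothetical NumPy illustration, not the authors' implementation: the data, sizes, and step sizes are invented, and a simple linear decoder stands in for the BAE's decoder network. It shows the key structural ideas: encoder weights start at zero, each latent dimension gains one coefficient per epoch via a componentwise selection step, genes already claimed by another dimension are excluded (the disentanglement constraint), and the decoder is then updated by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cell-by-gene matrix; all sizes are illustrative assumptions.
n_cells, n_genes, n_latent = 100, 20, 3
X = rng.standard_normal((n_cells, n_genes))

B = np.zeros((n_genes, n_latent))   # sparse encoder weights, initialized to zero (step 1)
W = 0.01 * rng.standard_normal((n_latent, n_genes))  # linear decoder (stand-in for the neural decoder)
selected = set()                    # genes already used by any latent dimension
step_size, lr, n_epochs = 0.1, 0.01, 50

for epoch in range(n_epochs):
    for l in range(n_latent):
        # Residual of the current reconstruction X_hat = (X @ B) @ W.
        R = X - (X @ B) @ W
        # Score each candidate gene by how strongly it explains the
        # residual through latent dimension l (componentwise selection).
        scores = np.abs(X.T @ (R @ W[l]))
        # Disentanglement constraint (step 3): genes selected by *other*
        # dimensions are ineligible; a dimension may refine its own genes.
        own = {j for j in range(n_genes) if B[j, l] != 0}
        scores[list(selected - own)] = -np.inf
        j_best = int(np.argmax(scores))
        # Boosting update (step 2): one coefficient per dimension per epoch.
        B[j_best, l] += step_size * np.sign(X[:, j_best] @ (R @ W[l]))
        selected.add(j_best)
    # Step 4: gradient descent on the decoder parameters.
    Z = X @ B
    W -= lr * (-2.0 * Z.T @ (X - Z @ W) / n_cells)
```

After training, each gene carries a nonzero weight in at most one latent dimension, so each dimension is characterized by a small, disjoint gene set, mirroring the disentanglement behavior the caption describes.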