Fig. 2: Image reconstruction in MorphoGenie.
From: Generalizable morphological profiling of cells by interpretable unsupervised learning

a Hybrid disentanglement learning network architecture in MorphoGenie. MorphoGenie employs a dual-step learning strategy that jointly optimizes disentanglement learning and high-fidelity image generation (Supplementary Fig. S1). The architecture consists of two sequential steps. Step 1: Disentangled representation learning. A VAE variant, FactorVAE, learns disentangled representations in the latent space using a probabilistic encoder; the decoder reconstructs images from the latent representation. Step 2: Image generation and disentangled information distillation. The disentangled representation is transferred to a GAN, where the generator produces synthetic images and a discriminator distinguishes them from real images. Additionally, the trained encoder (with fixed weights) distills disentangled information into the GAN, enhancing the alignment between latent vector sampling for real and generated images. b Image reconstruction performance of MorphoGenie on four distinct cell image datasets: quantitative phase images (QPI) of suspension cells (lung cancer cell type classification and cell-cycle progression assay, scale bar = 20 μm) and fluorescence images of adherent cells (Cell-Painting drug assay, scale bar = 65 μm, and epithelial-to-mesenchymal transition (EMT) assay, scale bar = 30 μm). The lung cancer cell image dataset includes three major histologically differentiated subtypes of lung cancer, i.e., adenocarcinoma (LUAD: H1975), squamous cell carcinoma (LUSC: H2170) and small cell lung cancer (SCLC: H526). The cell-cycle dataset covers the classified cell-cycle stages (G1, S and G2 phase) of human breast cancer cells (MDA-MB231). The Cell-Painting drug assay dataset includes the human osteosarcoma U2OS cell line with (drugged) and without (mock) treatment with a glucocorticoid receptor agonist.
The EMT dataset includes the A549 cell line, labeled with endogenous vimentin–red fluorescent protein (VIM-RFP), at different states of EMT, i.e., epithelial (E), intermediate (I) and mesenchymal (M) states. c Comparative analysis of image reconstruction performance. The analysis is based on a composite score that takes into account the Structural Similarity Index (SSIM), Mean Squared Error (MSE) and Fréchet Inception Distance (FID) (see Methods and Supplementary Fig. S2). Source data are provided as a Source Data file.
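
The caption does not state how SSIM, MSE and FID are combined into the composite score in panel c; the exact definition is in Methods. As an illustration only, the sketch below assumes an equal-weight average of min-max-normalized metrics, with MSE and FID inverted because lower values are better. The function names (`mse`, `global_ssim`, `composite_scores`), the equal weighting, and the single-window SSIM are assumptions made for this sketch, not the paper's implementation. FID is taken as a precomputed number per model, since computing it requires an Inception network.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two images of the same shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))


def global_ssim(a, b, data_range=1.0):
    """SSIM computed over the whole image as a single window.

    Simplification: the standard SSIM averages this quantity over
    local sliding windows; a single global window is assumed here.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2  # usual stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = np.mean((a - mu_a) * (b - mu_b))
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return float(num / den)


def _minmax(x):
    """Rescale values to [0, 1] across models; ties map to 0.5."""
    x = np.asarray(x, dtype=np.float64)
    span = x.max() - x.min()
    if span == 0:
        return np.full_like(x, 0.5)
    return (x - x.min()) / span


def composite_scores(ssim_vals, mse_vals, fid_vals):
    """Hypothetical composite: equal-weight mean of normalized metrics.

    SSIM is higher-is-better; MSE and FID are lower-is-better and are
    therefore inverted after normalization. Higher composite = better.
    """
    s = _minmax(ssim_vals)
    m = 1.0 - _minmax(mse_vals)
    f = 1.0 - _minmax(fid_vals)
    return (s + m + f) / 3.0


# Example with two hypothetical models (values are made up for illustration).
scores = composite_scores([0.9, 0.5], [0.01, 0.05], [10.0, 50.0])
```

Because each metric is normalized across the models being compared, a score of this kind is only meaningful relative to the other models in the same comparison, not as an absolute measure of reconstruction quality.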