Fig. 2: The representation disentanglement process of LEOPARD on the KORA multi-omics dataset.

a The normalized temperature-scaled cross-entropy (NT-Xent)-based contrastive loss is computed for content and temporal representations. b–c Uniform manifold approximation and projection (UMAP) embeddings of content (b) and temporal (c) representations of the KORA multi-omics dataset’s validation set at various training epochs. Representations encoded from data of \(\mathrm{v}1\) and \(\mathrm{v}2\) (metabolomics and proteomics, depicted by blue and red dots, respectively) at timepoints \(\mathrm{t}1\) and \(\mathrm{t}2\) (S4 and F4, depicted by dark- and light-colored dots, respectively) are plotted. The data of \(\mathrm{v}2\) at \(\mathrm{t}2\) are imputed after each training epoch, while all other data are observed samples from the validation set. LEOPARD’s content and temporal encoders capture signals unique to omics-specific content and to temporal variations, respectively. In b, as training progresses, the data of \(\mathrm{v}2\) at \(\mathrm{t}1\) and \(\mathrm{t}2\) (dark and light red dots) form one cluster and the data of \(\mathrm{v}1\) at \(\mathrm{t}1\) and \(\mathrm{t}2\) (dark and light blue dots) form another, indicating that the content encoder learns timepoint-invariant content representations. Similarly, in c, embeddings from the same timepoint cluster together: one cluster is formed by the data of \(\mathrm{v}1\) and \(\mathrm{v}2\) at \(\mathrm{t}1\) (dark blue and dark red dots), and the other by the data of \(\mathrm{v}1\) and \(\mathrm{v}2\) at \(\mathrm{t}2\) (light blue and light red dots). This demonstrates that LEOPARD can effectively factorize omics data into content and temporal representations. Source data are provided as a Source Data file.
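
For readers unfamiliar with the loss named in panel a, the following is a minimal NumPy sketch of a generic NT-Xent loss, not LEOPARD's actual implementation. The function name `nt_xent`, the `pos_index` argument, and the default temperature are illustrative assumptions. The choice of positive pairs is also assumed from the clustering described in this caption: content representations would pair the same view across timepoints, and temporal representations would pair the same timepoint across views.

```python
import numpy as np

def nt_xent(z, pos_index, tau=0.1):
    """Generic NT-Xent loss, averaged over all anchors (illustrative sketch).

    z         : (n, d) array of representations.
    pos_index : (n,) integer array; pos_index[i] is the row that forms the
                positive pair with row i (must differ from i).
    tau       : temperature (hypothetical default, not taken from the paper).
    """
    # Cosine similarity is the dot product of unit-length vectors.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / tau
    # An anchor is never compared with itself.
    np.fill_diagonal(sim, -np.inf)
    # Numerically stable row-wise log-softmax.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    n = z.shape[0]
    # Negative log-probability of each anchor's positive, averaged.
    return float(-log_prob[np.arange(n), pos_index].mean())

# Toy example: rows 0-1 share one direction, rows 2-3 share another.
toy = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
loss_matched = nt_xent(toy, np.array([1, 0, 3, 2]), tau=1.0)     # positives are aligned
loss_mismatched = nt_xent(toy, np.array([2, 3, 0, 1]), tau=1.0)  # positives are orthogonal
```

In this toy case the loss is lower when each anchor's designated positive is its most similar row. Minimizing the loss pulls positives together, which is the behavior that would produce the clusters seen in b and c.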