Fig. 1: Overview of the CLOOME framework.
From: CLOOME: contrastive learning unlocks bioimaging databases for queries with chemical structures

a, b The CLOOME encoders can be used to query a bioimaging database by a chemical structure (a) and, conversely, to query a chemical database by a microscopy image (b). c Visualization of the embedding space as a t-SNE projection of image embeddings of new cell phenotypes. Each point represents a microscopy image from a hold-out set. The color indicates the cell phenotype, which was also withheld from training. The CLOOME embeddings (left) are indicative of the cell phenotype (points of the same color cluster together). CellProfiler features are less indicative of cell phenotypes (only a few colors cluster together). d A multi-modal setting for imaging cell phenotypes. Small molecules are administered to cells, which are then imaged to capture potential phenotypic effects. In this way, matched image-structure pairs are obtained. e Schematic depiction of the training procedure of CLOOME. During training, the similarity of matched image-structure pairs is increased (black arrows), while the similarity of unmatched image-structure pairs is decreased (gray arrows). f The encoders of CLOOME, a structure encoder and a microscopy image encoder, map chemical structures and microscopy images to the same embedding space. Both encoders are deep neural networks. Matched pairs of chemical structures and microscopy images are mapped to embeddings that are close together, whereas unmatched pairs are mapped to embeddings that are far apart. Source data are provided as a Source data file.
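
The ideas in panels a, b, e and f can be illustrated with a small sketch. It is not the CLOOME implementation: it assumes the embeddings have already been computed by some pair of encoders, uses a generic symmetric InfoNCE-style contrastive loss (which may differ from the objective the authors use), and ranks database entries by cosine similarity. The function names `contrastive_loss` and `retrieve` and the `temperature` value are hypothetical, and plain Python lists stand in for real embedding tensors.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_matrix(images, structures):
    """Pairwise cosine similarities: rows are images, columns are structures."""
    imgs = [normalize(v) for v in images]
    strs = [normalize(v) for v in structures]
    return [[sum(a * b for a, b in zip(i, s)) for s in strs] for i in imgs]


def contrastive_loss(images, structures, temperature=0.1):
    """Generic symmetric InfoNCE-style loss (an assumption, see lead-in).

    Pair k is (images[k], structures[k]). The loss falls as matched pairs
    become more similar and unmatched pairs less similar (panel e).
    """
    sim = cosine_matrix(images, structures)
    n = len(sim)

    def mean_cross_entropy(rows):
        total = 0.0
        for k, row in enumerate(rows):
            logits = [s / temperature for s in row]
            m = max(logits)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in logits))
            total += log_z - logits[k]  # matched partner sits at index k
        return total / n

    columns = [list(c) for c in zip(*sim)]  # structure-to-image direction
    return 0.5 * (mean_cross_entropy(sim) + mean_cross_entropy(columns))


def retrieve(query, database, top_k=1):
    """Indices of the top_k database embeddings most similar to the query."""
    sims = cosine_matrix([query], database)[0]
    ranked = sorted(range(len(sims)), key=lambda j: sims[j], reverse=True)
    return ranked[:top_k]
```

Because both modalities share one embedding space, the same `retrieve` call covers both query directions: pass a structure embedding as `query` and image embeddings as `database` for panel a, or swap the roles for panel b.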