Fig. 2: Visualization of molecular representations learned by MLM-FG via UMAP.
From: Pre-trained molecular language models with random functional group masking

Representations are extracted from the downstream datasets without fine-tuning; together, these datasets contain 312,879 unique molecules. Each point is colored by its corresponding molecular weight (g/mol).