Fig. 5
From: Domain adaptation of a SMILES chemical transformer to SELFIES with limited computational resources

Cosine similarity between the embeddings of various molecules and the embedding of methane, comparing the ChemBERTa-zinc-base-v1 model (top) with the SELFIES-domain-adapted model (bottom). Alkanes (ethane, propane, butane) show higher similarity to methane than ring-containing or heteroatom-rich structures (benzene, phenol). The SELFIES-domain-adapted model often separates molecules by functional group more sharply.
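
As a minimal sketch of the comparison the caption describes, the Python snippet below computes cosine similarity between a reference vector and a set of candidate vectors, then ranks the candidates. The function names `cosine_similarity` and `rank_against_reference` are hypothetical, and the three-dimensional vectors are placeholders chosen for illustration, not embeddings produced by either model. How the per-molecule embeddings are obtained (for example, how token states are pooled) is not stated in the caption and is left out here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_against_reference(reference, candidates):
    """Return (name, similarity) pairs, most similar to the reference first."""
    scores = {
        name: cosine_similarity(reference, vec)
        for name, vec in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder vectors for illustration only; these are NOT model embeddings.
reference_vec = [1.0, 0.0, 0.0]
toy_candidates = {
    "molecule_a": [2.0, 0.0, 0.0],  # same direction as the reference
    "molecule_b": [0.0, 1.0, 0.0],  # orthogonal to the reference
    "molecule_c": [1.0, 1.0, 0.0],  # 45 degrees from the reference
}

ranking = rank_against_reference(reference_vec, toy_candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because cosine similarity depends only on direction, `molecule_a` scores the same as the reference itself even though its vector is twice as long. In a figure like this one, the same ranking would be computed once per model, with methane's embedding as the reference.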