Fig. 3: Text-guided compound generation.
From: Exploration of crystal chemical space using text-guided generative artificial intelligence

a A box plot of the accumulated composition-matching ratio as a function of the number of atoms. The sample sizes (n) are: n = 61 for structures with fewer than 10 atoms, n = 300 for structures with fewer than 20 atoms, n = 340 for structures with fewer than 30 atoms, and n = 707 for structures with fewer than 40 atoms. Boxes span the 25th to 75th percentiles (the interquartile range), horizontal lines mark the median, and whiskers extend to 1.5 × the interquartile range. b A box plot of the composition and crystal system matching ratios of the Crystal CLIP model for different prompt types, for structures containing fewer than 20 atoms (300 generated structures in total). The box plot follows the same statistical format and visualization parameters as in (a). c A t-SNE plot of compositional embeddings generated by Magpie for train, test, and generated structures from the Baseline BERT and Crystal CLIP models, using general text descriptions. Generated structures that differ from the test set (non-overlapping) are highlighted with red edges. d A t-SNE plot of structural embeddings generated by CrystalNN Fingerprint, showing train, test, and generated structures using general text descriptions. Source data are provided in Source Data file 1.
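The box-plot convention described for panels (a) and (b) can be illustrated with a minimal sketch. It assumes the common Tukey-style reading, in which each whisker ends at the most extreme data point lying within 1.5 × the interquartile range of the box edge, and it assumes linearly interpolated quartiles; the caption does not state either detail. The function name `box_plot_stats` and the sample values are hypothetical and are not taken from the source data.

```python
from statistics import quantiles


def box_plot_stats(values, whisker=1.5):
    """Summary statistics for a Tukey-style box plot (illustrative only)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")

    # Quartiles with linear interpolation between neighbouring data points.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Fences sit 1.5 x IQR beyond the box edges (assumed convention).
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr

    # Whiskers stop at the most extreme points that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical matching ratios, made up for illustration.
example_ratios = [0.0, 0.2, 0.4, 0.5, 0.5, 0.6, 0.8, 1.0]
example_stats = box_plot_stats(example_ratios)
```

Points that fall outside the fences are returned separately because plotting libraries typically draw them as individual markers rather than extending the whisker to reach them.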