Fig. 4: Important tokens identified from captions in the Poly-Caption dataset. | npj Computational Materials