Fig. 8: Training data size needed by LLM-Prop to achieve state-of-the-art (SOTA) results when predicting band gap, volume, and is-gap-direct.

From: LLM-Prop: predicting the properties of crystalline materials using large language models

a–c Results for band gap, volume, and is-gap-direct, respectively. For instance, LLM-Prop achieves SOTA results on band gap and volume prediction with only about 90k training examples (log₁₀(90k) ≈ 4.95 on the figure), roughly 35k fewer examples than the baselines are trained on. We plot the base-10 logarithm of the training data size on the x-axis for clarity. The performance of each model is calculated on the validation set.
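For reference, the x-axis value quoted above is simply the base-10 logarithm of the training-set size; a minimal Python check (the 90,000 figure is the approximate training size stated in the caption):

```python
import math

# Approximate training-set size at which LLM-Prop reaches SOTA (from the caption).
train_size = 90_000

# Base-10 log, as plotted on the figure's x-axis.
x_axis_value = math.log10(train_size)
print(f"log10({train_size}) = {x_axis_value:.2f}")  # -> 4.95
```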
