Table 2 Performance comparison on the test set between LLM-Prop and a text/description-based baseline (MatBERT)

Model	Band gap	Volume	FEPA	EPA	Ehull	Is-gap-direct
	(eV) ↓	(Å³/cell) ↓	(eV/atom) ↓	(eV/atom) ↓	(eV/atom) ↓	(AUC) ↑
Text-based
MatBERT w/ Numbers	0.258	56.613	0.071	0.100	0.058	0.710
MatBERT w/o Numbers	0.262	54.969	0.079	0.104	0.053	0.714
MatBERT w/ [NUM]&[ANG]	0.260	55.984	0.076	0.098	0.050	0.722
LLM-Prop w/ Numbers	0.232	39.138	0.056	0.071	0.049	0.835
LLM-Prop w/o Numbers	0.231	39.252	0.056	0.072	0.047	0.839
LLM-Prop w/ [NUM]&[ANG]	0.234	40.123	0.057	0.067	0.047	0.857

“w/ Numbers” denotes retaining both bond lengths and bond angles in the crystal description, “w/o Numbers” denotes removing both from the description, and “w/ [NUM]&[ANG]” denotes replacing bond lengths and bond angles with [NUM] and [ANG] tokens, respectively.
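The three input variants in the table can be illustrated with a minimal sketch. This is not the authors' preprocessing code. It assumes that bond lengths appear in the description as a number (or a numeric range) followed by "Å", that bond angles appear as a number followed by "°" or "degrees", and that the unit is replaced together with the number. The function name `preprocess` and the mode strings are hypothetical, chosen only for this example.

```python
import re

# Assumption: a value is a plain decimal number, optionally a range such as "2.30-2.45".
_NUMBER = r"\d+(?:\.\d+)?"
_RANGE = rf"{_NUMBER}(?:\s*[-–]\s*{_NUMBER})?"

# Assumption: angles are followed by "°" or "degree(s)", lengths by "Å".
_ANGLE = re.compile(rf"{_RANGE}\s*(?:°|degrees?)")
_LENGTH = re.compile(rf"{_RANGE}\s*Å")


def preprocess(description: str, mode: str) -> str:
    """Return one of the three description variants (hypothetical helper)."""
    if mode == "with_numbers":        # "w/ Numbers": leave the text untouched
        return description
    if mode == "without_numbers":     # "w/o Numbers": delete lengths and angles
        length_sub, angle_sub = "", ""
    elif mode == "num_ang_tokens":    # "w/ [NUM]&[ANG]": substitute special tokens
        length_sub, angle_sub = "[NUM]", "[ANG]"
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    text = _ANGLE.sub(angle_sub, description)
    text = _LENGTH.sub(length_sub, text)
    # Collapse runs of whitespace left behind by deletions.
    return re.sub(r"\s{2,}", " ", text).strip()


example = "All Na-Cl bond lengths are 2.82 Å. The Cl-Na-Cl bond angles are 90°."
print(preprocess(example, "num_ang_tokens"))
```

With the example sentence above, the `num_ang_tokens` mode yields "All Na-Cl bond lengths are [NUM]. The Cl-Na-Cl bond angles are [ANG]." Real descriptions may format values differently, so the patterns would need to be adapted to the actual text.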
