Table 2 Named entity recognition and relation extraction scores for three tasks in materials science using models with a JSON output schema

From: Structured information extraction from scientific text with large language models

| Task    | Relation                | E.M. Precision (GPT-3) | E.M. Recall (GPT-3) | E.M. F1 (GPT-3) | E.M. Precision (Llama-2) | E.M. Recall (Llama-2) | E.M. F1 (Llama-2) |
|---------|-------------------------|------------------------|---------------------|-----------------|--------------------------|-----------------------|-------------------|
| Doping  | host-dopant             | 0.772 | 0.684 | 0.726 | 0.836 | 0.807 | 0.821^a |
| General | formula-name            | 0.507 | 0.429 | 0.456 | 0.462 | 0.417 | 0.367 |
| General | formula-acronym         | 0.500 | 0.250 | 0.333 | 0.333 | 0.250 | 0.286 |
| General | formula-structure/phase | 0.538 | 0.439 | 0.482 | 0.551 | 0.432 | 0.470 |
| General | formula-application     | 0.542 | 0.543 | 0.537 | 0.545 | 0.496 | 0.516 |
| General | formula-description     | 0.362 | 0.350 | 0.354 | 0.347 | 0.342 | 0.340 |
| MOFs    | name-formula            | 0.425 | 0.688 | 0.483 | 0.460 | 0.454 | 0.276 |
| MOFs    | name-guest specie       | 0.789 | 0.576 | 0.616 | 0.497 | 0.407 | 0.408 |
| MOFs    | name-application        | 0.657 | 0.518 | 0.573 | 0.507 | 0.562 | 0.531 |
| MOFs    | name-description        | 0.493 | 0.475 | 0.404 | 0.432 | 0.411 | 0.389 |
  1. Exact-match (E.M.) scores are evaluated on a per-word basis, and a link is counted as correct only if both entities and the relationship are correct. The exact-match metric counts output that contains the correct information but is worded differently as incorrect, making these scores a rough lower bound on the true performance of the models. Precision, recall, and F1 reflect scores on a held-out test set for the doping models and averages over five cross-validation sets for the general and MOF models. A simplified scoring sketch is given after these notes.
  2. ^a Best F1 scores for each task are shown in bold.
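
As a rough illustration of how such exact-match relation scores can be computed, the following minimal Python sketch (not the authors' evaluation code) scores predicted (entity, relation, entity) links against gold annotations at the whole-triple level; the paper's per-word accounting is more fine-grained, and the example triples are hypothetical.

```python
# Illustrative sketch only: exact-match precision/recall/F1 over relation triples.
# A predicted link counts as correct only if both entities and the relation match
# a gold triple exactly (here at the whole-triple level, a simplification of the
# paper's per-word evaluation).

from collections import Counter


def exact_match_scores(predicted, gold):
    """Return (precision, recall, F1) for exact-match triple extraction."""
    pred_counts = Counter(predicted)
    gold_counts = Counter(gold)
    # A triple is a true positive only when it appears verbatim in both lists.
    tp = sum(min(pred_counts[t], gold_counts[t]) for t in pred_counts)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical example: a doping sentence should yield one host-dopant link.
predicted = [("SrTiO3", "host-dopant", "Nb"), ("SrTiO3", "host-dopant", "La")]
gold = [("SrTiO3", "host-dopant", "Nb")]
print(exact_match_scores(predicted, gold))  # -> (0.5, 1.0, 0.666...)
```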