Scientific Data

Table 4 Preselection evaluation results: precision (P), recall (R) and F1-score (F1) of LLMs.

From: Annotated textual dataset PV600 of perovskite bandgaps for information extraction from literature

	Mixtral	Mixtral-Instruct	Llama-3.1	Llama3-ChatQA	GPT-4o
P	33.7	74.3	32.8	35.8	87.8 (±0.6)
R	66.0	55.3	92.0	97.3	94.7 (±1.8)
F1	44.6	63.4	48.3	52.4	91.6 (±0.2)

The GPT-4o results are presented as average ± standard deviation.
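
As a quick consistency check on the table, the sketch below recomputes F1 as the harmonic mean of precision and recall, F1 = 2PR / (P + R), from the rounded P and R values above. The function and variable names (`f1_score`, `TABLE4`, `recompute_f1`) are illustrative and not from the article. Because the inputs are already rounded, the recomputed values can differ from the reported ones by about 0.1. For GPT-4o the gap is larger (roughly 91.1 recomputed versus 91.6 reported), which would be consistent with F1 having been averaged over individual runs rather than derived from the averaged P and R, although the table itself does not say so.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from Table 4; GPT-4o entries are the reported averages.
TABLE4 = {
    "Mixtral": (33.7, 66.0, 44.6),
    "Mixtral-Instruct": (74.3, 55.3, 63.4),
    "Llama-3.1": (32.8, 92.0, 48.3),
    "Llama3-ChatQA": (35.8, 97.3, 52.4),
    "GPT-4o": (87.8, 94.7, 91.6),
}


def recompute_f1(table: dict) -> dict:
    """Return {model: (recomputed F1, reported F1)} for every row of the table."""
    return {
        model: (f1_score(p, r), reported)
        for model, (p, r, reported) in table.items()
    }


if __name__ == "__main__":
    for model, (recomputed, reported) in recompute_f1(TABLE4).items():
        diff = recomputed - reported
        print(f"{model:17s} recomputed={recomputed:5.1f} reported={reported:5.1f} diff={diff:+.2f}")
```

Read this way, the table shows that the open-weight base models reach high recall at low precision, while GPT-4o is the only model with both values above 85.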
