Table 5. Information extraction (IE) results from the three tests for all models.

From: Annotated textual dataset PV600 of perovskite bandgaps for information extraction from literature

 

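| Test | Metric | Mixtral | Mixtral-Instruct | Llama-3.1 | Llama3-ChatQA | GPT-4o | CDE2 | QA-MatSciBERT |
|------|--------|---------|------------------|-----------|---------------|--------|------|---------------|
| T1 | P  | 23.4 | 67.1 | 23.8 | 44.0 | 81.7 (± 0.2) |      |      |
| T1 | R  | 41.0 | 43.2 | 54.6 | 67.8 | 81.1 (± 0.4) |      |      |
| T1 | F1 | 29.8 | 52.6 | 33.2 | 53.4 | 81.4 (± 0.3) |      |      |
| T2 | P  | 71.2 | 79.9 | 77.8 | 77.7 | 81.7 (± 0.2) | 87.0 | 87.5 |
| T2 | R  | 59.9 | 71.8 | 60.3 | 69.2 | 81.1 (± 0.4) | 29.5 | 61.7 |
| T2 | F1 | 65.1 | 75.6 | 68.0 | 73.2 | 81.4 (± 0.3) | 44.1 | 72.3 |
| T3 | P  | 23.0 | 32.8 | 25.6 | 41.0 | 65.6 (± 0.6) | 81.6 | 65.0 |
| T3 | R  | 63.9 | 75.3 | 61.2 | 71.4 | 83.1 (± 1.3) | 31.3 | 63.0 |
| T3 | F1 | 33.8 | 45.7 | 36.0 | 52.1 | 73.3 (± 0.9) | 45.2 | 64.0 |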

  1. T1 stands for Test 1 (preselection with the same model), T2 for Test 2 (preselection with GPT-4o), and T3 for Test 3 (IE without preselection). The GPT-4o results are given as mean ± standard deviation.
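
As a sanity check on how the P, R and F1 rows relate, the following minimal sketch recomputes F1 from the tabulated precision and recall for a few T2 entries. It assumes F1 is the standard harmonic mean of P and R, which this table does not state explicitly; the function name `f1_score` and the `rows` dictionary are illustrative, not taken from the source.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale, here percent).

    Assumption: the table's F1 follows this standard definition.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) triples copied from the T2 rows of the table.
rows = {
    "Mixtral": (71.2, 59.9, 65.1),
    "Mixtral-Instruct": (79.9, 71.8, 75.6),
    "CDE2": (87.0, 29.5, 44.1),
}

for model, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    print(f"{model}: computed {computed:.2f}, reported {reported}")
```

For some other entries the recomputed value differs from the reported one in the last digit, which is plausible if P and R were rounded before being tabulated.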