Table 6 Manual performance evaluation for LLMs before fine-tuning.

	GPT-4	Yi	InternLM2	Mixtral
Aim	1.0	0.71	0.89	0.55
Motivation	1.0	0.65	0.90	0.61
Methods	0.97	0.68	0.88	0.59
Question addressed	0.98	0.73	0.81	0.68
Evaluation metrics	1.0	0.55	0.65	0.42
Result	0.97	0.70	0.91	0.65
Limitations	0.90	0.51	0.61	0.62
Contribution	1.0	0.72	0.85	0.75
Future work	0.92	0.68	0.77	0.56
Average	0.97	0.65	0.80	0.60
