Table 6 Response stability of different large language models.

From: Leveraging large language models and embedding representations for enhanced word similarity computation

| Model | Proportion of effective responses |
| --- | --- |
| Bloom-7B1 | 98.6% |
| Qwen-7B-Chat-Int4 | 99.0% |
| Qwen-7B-Chat | 99.1% |
| Deepseek-7B | 99.6% |
| ChatGPT-3.5-turbo | 99.7% |
| ChatGPT-4 | 99.8% |