Table 2 Statistics of hallucinated papers in the computer science and biomedicine domains
From: Synthesizing scientific literature with retrieval-augmented language models
Model | Computer science: Total no. | Computer science: No. of hallucinated (↓) | Computer science: Ratio (↓) | Biomedicine: Total no. | Biomedicine: No. of hallucinated (↓) | Biomedicine: Ratio (↓) |
|---|---|---|---|---|---|---|
OpenScholar-8B | 9.65 | 0 | 0 | 6.25 | 0 | 0 |
Llama 3.1 8B | 5.20 | 4.79 | 92.1% | 5.58 | 5.46 | 97.6% |
Llama 3.1 70B | 6.14 | 4.78 | 78.1% | 6.98 | 6.74 | 96.6% |
GPT-4o | 5.74 | 4.52 | 78.7% | 5.24 | 4.97 | 94.8% |