Fig. 4: Robustness analysis of DrugGPT. | Nature Biomedical Engineering

From: A collaborative large language model for drug analysis

Performance of each method as a function of the size of the knowledge bases used. We report F1 scores for the ChatDoctor dataset and accuracy for the other datasets. Evaluations are run with knowledge bases ranging from 1% to 100% of their full size. DrugGPT consistently outperforms ChatGPT in all cases. Despite having fewer parameters, DrugGPT achieves performance similar to GPT-4 on ADE, Drug-Effects and DDI when using only a small fraction (that is, 10%) of the available knowledge bases. Moreover, the more knowledge DrugGPT uses, the better it performs relative to the baseline models, and with the full knowledge bases it outperforms GPT-4.
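The knowledge-base sweep described in the caption can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names (`subsample_kb`, `accuracy`, `sweep`), the use of nested random subsets, and the `answer_fn` callback standing in for a model are all assumptions made for illustration.

```python
import random


def subsample_kb(entries, fraction, seed=0):
    """Return a random subset holding about `fraction` of the entries (at least one).

    Hypothetical helper: the caption does not say how the subsets were drawn.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(entries) * fraction))
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    # Taking a prefix of one fixed shuffle makes the subsets nested:
    # a smaller subset is always contained in a larger one (same seed).
    return shuffled[:k]


def accuracy(predictions, references):
    """Fraction of predictions that exactly match their reference."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def sweep(entries, fractions, answer_fn, questions, references):
    """Evaluate `answer_fn(question, kb)` at each knowledge-base fraction."""
    results = {}
    for fraction in fractions:
        kb = subsample_kb(entries, fraction)
        predictions = [answer_fn(q, kb) for q in questions]
        results[fraction] = accuracy(predictions, references)
    return results
```

Here `answer_fn` would wrap whatever model is being evaluated. Only the 1%, 10% and 100% settings are named in the caption, so any other fractions passed to `sweep` are the caller's choice.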
