Fig. 2: Comparative performance of different LLM models on medical documentation tasks. | npj Digital Medicine

From: Enhancing clinical documentation with voice processing and large language models: a study on the LAOS system

Automatic evaluation metrics comparing different model variants on cataract surgery documentation. Results are shown for three key document types: Admission Report, Surgery Record, and Discharge Summary. The evaluation includes ChatGLM2-6B [46], Baichuan-13B [36], Qwen-2 [20], Baichuan-13B-SFT, and Qwen2-7B-SFT models. Across all metrics (BERTScore [43], ROUGE-L [42], and BLEU [41]), the models show varying performance in generating accurate and relevant medical documentation.
