Comparative benchmarking of the DeepSeek large language model on medical tasks and clinical reasoning

The open-source DeepSeek large language model showed variable performance relative to two leading models when benchmarked on four different medical tasks, with relatively strong reasoning capabilities but similar or weaker relative performance on other tasks, such as summarization of imaging reports.

Mickael Tordjman
Zelong Liu
Xueyan Mei

Research, 23 Apr 2025

Nature Medicine

Volume: 31, P: 2550-2555
