Fig. 2: Dataset statistics of MedThink-Bench.
From: Automating expert-level medical reasoning evaluation of large language models

(a) Breakdown of the 10 medical domains included in the MedThink-Bench dataset. (b) Detailed statistics of the dataset.