Fig. 2: Statistical overview of the Q2CRBench-3 benchmark dataset.
From: Streamlining evidence based clinical recommendations with large language models

(a) Data distribution across the three datasets of Q2CRBench-3. (b) Total number of literature records retrieved for all clinical questions and the proportion excluded after screening. The retrieval results represent a subset of the original search data. Notably, 99.49% of retrieved records were excluded, highlighting the substantial burden of evidence screening in clinical decision support.