Fig. 3: The results of question decomposition.
From: Streamlining evidence based clinical recommendations with large language models

a Workflow diagram of question decomposition. The LLM extracts the PICO (Population, Intervention, Comparison, Outcome) components from the question, and each component may contain several concepts. The Outcome component is absent from the 2021 ACR RA questions. b Performance of different methods on the question decomposition task. The left panel shows F1 scores and the right panel shows BERTScores; error bars represent 95% confidence intervals. Methods that incorporate examples outperform those that do not, with the largest improvement in the Comparison component. c Example outputs from different decomposition approaches. Population concepts are highlighted in pink, Intervention concepts in blue, and Comparison concepts in green. The self-reflection approach yields more fine-grained Comparison concepts, which give users clearer and more actionable search terms.
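As a rough illustration of how a per-component F1 score such as the one in panel b could be computed, the sketch below compares predicted and reference concept sets for each component. It assumes exact matching after lowercasing and whitespace trimming; the caption does not state the matching criterion actually used, and the function names and the example concepts are hypothetical.

```python
from typing import Dict, Set

# Outcome is omitted because the caption notes it is absent from these questions.
COMPONENTS = ("Population", "Intervention", "Comparison")


def concept_f1(predicted: Set[str], gold: Set[str]) -> float:
    """Set-based F1 between predicted and reference concepts.

    Assumption: concepts match only if identical after lowercasing
    and trimming whitespace.
    """
    pred = {c.strip().lower() for c in predicted}
    ref = {c.strip().lower() for c in gold}
    if not pred and not ref:
        return 1.0  # assumption: two empty sets count as full agreement
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


def per_component_f1(
    predicted: Dict[str, Set[str]], gold: Dict[str, Set[str]]
) -> Dict[str, float]:
    """Compute F1 separately for each PICO component."""
    return {
        name: concept_f1(predicted.get(name, set()), gold.get(name, set()))
        for name in COMPONENTS
    }


# Made-up concepts for illustration only; not taken from the figure.
predicted = {
    "Population": {"rheumatoid arthritis"},
    "Intervention": {"Methotrexate"},
    "Comparison": {"other DMARDs"},
}
gold = {
    "Population": {"rheumatoid arthritis", "low disease activity"},
    "Intervention": {"methotrexate"},
    "Comparison": {"hydroxychloroquine", "sulfasalazine"},
}
scores = per_component_f1(predicted, gold)
```

In this toy case the coarse Comparison prediction scores zero against the finer-grained reference concepts, which shows why a stricter matching rule penalizes less specific Comparison outputs.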