Fig. 1: Multi-step architecture of the radiology Retrieval and Reasoning (RaR) framework for radiology question answering.
From: Multi-step retrieval and reasoning improves radiology question answering with large language models

The pipeline combines structured retrieval with multi-step reasoning to generate evidence-grounded diagnostic reports. (1) Each question is preprocessed to extract key diagnostic concepts (using Mistral Large) and paired with multiple-choice options. (2) A supervisor module creates a structured research plan, delegating each diagnostic option to a dedicated research module. (3) Research modules iteratively retrieve targeted evidence from www.radiopaedia.org via a SearXNG-powered search tool, refining queries when needed. (4) Retrieved content is synthesized into structured report sections (using GPT-4o-mini and formatting tools), including supporting and contradicting evidence with citations. (5) The supervisor compiles all sections into a final diagnostic report (introduction, analysis, and conclusion), which is appended to the prompt for final answer selection. The entire workflow is coordinated through a stateful directed graph that preserves shared memory, retrieved context, and intermediate drafts.
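The control flow in steps (1) to (5) can be illustrated with a minimal sketch of a stateful directed graph. This is not the authors' implementation. Every name here (`State`, `search`, `CORPUS`, `NODES`, `run`) is hypothetical. The LLM calls (Mistral Large, GPT-4o-mini) are replaced by simple string operations, and the SearXNG-backed search tool is replaced by an in-memory stub holding placeholder text. The sketch only shows how nodes could share one state object and route to each other, including one query refinement when the first search returns nothing.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class State:
    """Shared memory passed to every node (hypothetical layout)."""
    question: str
    options: List[str]
    concepts: List[str] = field(default_factory=list)
    plan: List[str] = field(default_factory=list)  # options still to research
    evidence: Dict[str, List[str]] = field(default_factory=dict)
    sections: Dict[str, str] = field(default_factory=dict)
    report: str = ""


# Placeholder snippets standing in for retrieved pages; not real search output.
CORPUS = [
    "Pneumothorax: air in the pleural space [stub-1]",
    "Pleural effusion: fluid in the pleural space [stub-2]",
]


def search(query: str) -> List[str]:
    """Stub search tool: a snippet matches only if it contains every query word."""
    words = query.lower().split()
    return [text for text in CORPUS if all(w in text.lower() for w in words)]


def preprocess(state: State) -> Optional[str]:
    # Step 1 stand-in: keep longer words as "key concepts" instead of calling an LLM.
    tokens = [w.strip("?.,").lower() for w in state.question.split()]
    state.concepts = [t for t in tokens if len(t) > 4]
    return "plan"


def plan(state: State) -> Optional[str]:
    # Step 2: the supervisor queues one research task per answer option.
    state.plan = list(state.options)
    return "research"


def research(state: State) -> Optional[str]:
    # Step 3: search with option plus concepts, then refine to the option alone.
    option = state.plan[0]
    hits = search(option + " " + " ".join(state.concepts))
    if not hits:
        hits = search(option)
    state.evidence[option] = hits
    return "write"


def write(state: State) -> Optional[str]:
    # Step 4: turn the evidence for the current option into a report section.
    option = state.plan.pop(0)
    hits = state.evidence[option]
    state.sections[option] = option + ": " + ("; ".join(hits) or "no evidence found")
    return "research" if state.plan else "compile"


def compile_report(state: State) -> Optional[str]:
    # Step 5: the supervisor assembles introduction, analysis, and conclusion.
    body = "\n".join(state.sections[o] for o in state.options)
    state.report = (
        "Introduction: " + state.question + "\n"
        "Analysis:\n" + body + "\n"
        "Conclusion: compare the evidence listed for each option."
    )
    return None  # no outgoing edge: the graph terminates here


NODES: Dict[str, Callable[[State], Optional[str]]] = {
    "preprocess": preprocess,
    "plan": plan,
    "research": research,
    "write": write,
    "compile": compile_report,
}


def run(state: State, start: str = "preprocess") -> State:
    """Follow edges until a node returns None, mutating the shared state."""
    node: Optional[str] = start
    while node is not None:
        node = NODES[node](state)
    return state
```

Calling `run(State(question=..., options=[...]))` walks the graph once per option through the research and write nodes before compiling. In the described pipeline, the resulting `report` would then be appended to the prompt for final answer selection, a step this sketch leaves out.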