Fig. 1: The multi-agent LLM system underlying CASSIA.
From: CASSIA: a multi-agent large language model for automated and interpretable cell annotation

a A user interacts directly with the Onboarding platform by specifying species, tissue type, and a collection of markers associated with cell subtypes within that tissue, if known. Information on experimental conditions, interventions, or other sample-specific details may also be provided. Created in BioRender. Shireman, J. (2025) https://BioRender.com/e55de3z. b Together, this input is used to create the user prompt given to the Annotator agent. The Annotator agent performs a comprehensive annotation of the single-cell data using a zero-shot chain-of-thought approach that mimics the standard workflow a computational biologist would typically follow for cell annotation. Results are then passed to the Validator agent, which checks marker and cell type consistency; results failing validation are passed back to the Annotator, and this iterative process continues until results pass validation or the maximum number of iterations is reached. Results are then moved to the Formatting agent, which summarizes each cell annotation; this summary, along with the full conversation history, is provided to the Scoring agent for quality scoring. The Reporter agent then generates a comprehensive report documenting the complete annotation process, including agent conversations, quality evaluation reasoning, and validation decisions with supporting evidence, to facilitate transparent interpretation of results. The output from default CASSIA is shown in (c). Optional agents include those shown in (d). Source data are provided as a Source Data file. Created in BioRender. Shireman, J. (2025) https://BioRender.com/9gzyoe5.
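The Annotator/Validator loop described in panel b can be illustrated with a minimal Python sketch. This is not CASSIA's implementation or API: the names `annotate_with_validation`, `AnnotationResult`, `toy_annotator`, and `toy_validator` are hypothetical, the agents are stand-in functions rather than LLM calls, and the default of three iterations is an assumption. The downstream Formatting, Scoring, and Reporter agents are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class AnnotationResult:
    """Hypothetical container for the outcome of the loop."""
    annotation: str
    passed_validation: bool
    iterations: int
    history: List[str] = field(default_factory=list)


def annotate_with_validation(
    user_prompt: str,
    annotator: Callable[[str, Optional[str]], str],
    validator: Callable[[str], Tuple[bool, str]],
    max_iterations: int = 3,  # assumed default, not taken from the source
) -> AnnotationResult:
    """Re-run the annotator until the validator accepts or the cap is hit."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")

    history = [f"user: {user_prompt}"]
    feedback: Optional[str] = None
    annotation = ""

    for iteration in range(1, max_iterations + 1):
        # The annotator sees the prompt plus any feedback from the last round.
        annotation = annotator(user_prompt, feedback)
        history.append(f"annotator: {annotation}")

        # The validator checks marker and cell type consistency.
        passed, feedback = validator(annotation)
        history.append(f"validator: {feedback}")

        if passed:
            return AnnotationResult(annotation, True, iteration, history)

    # Cap reached: return the last annotation, flagged as not validated.
    return AnnotationResult(annotation, False, max_iterations, history)


# Toy stand-ins for the two agents (no LLM involved).
def toy_annotator(prompt: str, feedback: Optional[str]) -> str:
    return "T cell" if feedback else "B cell"


def toy_validator(annotation: str) -> Tuple[bool, str]:
    if annotation == "T cell":
        return True, "markers are consistent with the annotation"
    return False, "CD3E and CD3D point away from B cells; reconsider"


result = annotate_with_validation(
    "species=human; tissue=blood; markers=CD3E,CD3D,IL7R",
    toy_annotator,
    toy_validator,
)
print(result.annotation, result.passed_validation, result.iterations)
```

In this sketch the full `history` list plays the role of the conversation history that the caption says is handed on for scoring and reporting; a real system would pass it, together with a formatted summary, to the later agents.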