Fig. 2: Relative change (Δ) in irritability scores following exposure to irritation-inducing prompts across three validated instruments: the Brief Irritability Test (BITe), the Irritability Questionnaire (IRQ), and the Caprara Irritability Scale (CIS). | npj Digital Medicine

From: Assessing the impact of safety guardrails on large language models using irritability metrics
