Fig. 21: Distribution of Stage 2 responses per model when prompted in French.
From: Large language models reflect the ideology of their creators

Left: label distributions of valid responses. Right: validity rates. A response is invalid if the Stage 1 response is a refusal or a clear hallucination, or if the Stage 2 response cannot be clearly mapped to the answer scale.
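The validity rule in this caption can be illustrated with a minimal sketch. It is not the authors' code: the `Response` record, the `stage1_flag` values, and the example labels are hypothetical names chosen for illustration, and the sketch assumes each response has already been annotated for refusal or hallucination and mapped (or not) to the answer scale.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Response:
    # Hypothetical annotation of the Stage 1 response:
    # "ok", "refusal", or "hallucination".
    stage1_flag: str
    # Hypothetical Stage 2 label; None when the response could not
    # be clearly mapped to the answer scale.
    stage2_label: Optional[str]


def is_valid(r: Response) -> bool:
    """Valid only if Stage 1 is neither a refusal nor a hallucination
    and Stage 2 maps to a label on the answer scale."""
    return r.stage1_flag == "ok" and r.stage2_label is not None


def summarize(responses: List[Response]) -> Tuple[Dict[str, float], float]:
    """Return (label distribution over valid responses, validity rate)."""
    valid = [r for r in responses if is_valid(r)]
    validity_rate = len(valid) / len(responses) if responses else 0.0
    counts = Counter(r.stage2_label for r in valid)
    distribution = {label: n / len(valid) for label, n in counts.items()}
    return distribution, validity_rate


# Illustrative data only, not results from the paper.
example = [
    Response("ok", "positive"),
    Response("ok", "negative"),
    Response("refusal", "positive"),  # invalid: Stage 1 refusal
    Response("ok", None),             # invalid: Stage 2 not mappable
]
distribution, validity_rate = summarize(example)
```

Under these assumptions, the left panel would correspond to `distribution` and the right panel to `validity_rate`, each computed per model.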