Extended Data Fig. 1: Distributions of (a) IPIP-NEO and (b) BFI personality domain scores across models. | Nature Machine Intelligence

From: A psychometric framework for evaluating and shaping personality traits in large language models

Box plots depict model medians surrounded by their interquartile ranges and outlier values. As models increased in size (for example, Flan-PaLM from 8B to 540B parameters), (a) IPIP-NEO scores were more stable than (b) BFI scores, where scores for socially desirable traits increased while neuroticism (NEU) scores decreased. n = 1,250 observations per model, per test; N = 22,500 total observations per test.
