Extended Data Fig. 2: Convergent Pearson’s correlations (rs) between IPIP-NEO and BFI scores by model.
From: A psychometric framework for evaluating and shaping personality traits in large language models

Heatmap illustrates the similarities (convergence) in IPIP-NEO and BFI score variation for each Big Five domain; the last row represents average correlations across all personality dimensions for a model. Stronger positive correlations (blue) indicate higher levels of convergence and provide evidence for convergent validity. EXT = extraversion; AGR = agreeableness; CON = conscientiousness; NEU = neuroticism; OPE = openness. N = 22,500 total paired test observations. All correlations are statistically significant at p < 0.0001 (two-sided values computed using Student’s t-distribution, based on n = 1,250 observations per model, per domain).
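
As an illustration of the significance test named in the legend, the following is a minimal sketch of how a Pearson correlation between two paired score vectors can be converted to a t statistic with n − 2 degrees of freedom. It is not the authors' analysis code. The function names are hypothetical, and the p value uses a standard normal approximation to Student's t-distribution, an assumption that is reasonable only because the degrees of freedom are large (1,248 when n = 1,250).

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of scores."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length, n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        raise ValueError("t is undefined for |r| = 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def approx_two_sided_p(t):
    """Two-sided p value from a normal approximation to the t-distribution.

    Assumption: degrees of freedom are large enough for the approximation.
    An exact value would need the t-distribution CDF instead.
    """
    return math.erfc(abs(t) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical correlation value, with n = 1,250 as stated in the legend.
    r, n = 0.5, 1250
    t = t_statistic(r, n)
    print(f"r = {r}, t = {t:.2f}, approx. two-sided p = {approx_two_sided_p(t):.3g}")
```

In this sketch, `pearson_r` would be applied to the paired IPIP-NEO and BFI scores for one model and one domain, and the resulting correlation passed to `t_statistic` with n = 1,250.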