Fig. 3: Probability distributions of 12 WEPs elicited from GPT-3.5 and GPT-4 using male, female, and gender-neutral contexts.
From: An evaluation of estimative uncertainty in large language models

Low-probability graphs have an x-axis range of 0–40, while the others have a range of 40–100.