Fig. 2: Women are represented as significantly younger than men in billions of words scraped from the internet, as encoded by OpenAI's largest open-source model (GPT-2 Large).
From: Age and gender distortion in online media and large language models

Correlation between age and gender associations for 3,495 social categories in GPT-2 Large. The horizontal axis presents the gender association from 0 (female) to 1 (male), and the vertical axis presents the age association from 0 (young) to 1 (old). The trend line shows the linear prediction from an ordinary least squares regression. The categories highlighted in orange are examples of those with the youngest and most female associations, whereas the categories highlighted in blue are examples of those with the oldest and most male associations.
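
For readers unfamiliar with how such a trend line is obtained, the following is a minimal sketch of a one-variable ordinary least squares fit, written in plain Python. The function name ols_fit and the five score pairs are hypothetical and chosen only for illustration; they are not values from the figure, and this is not the authors' analysis code.

```python
from statistics import fmean


def ols_fit(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2 points)")
    x_bar, y_bar = fmean(x), fmean(y)
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical scores on the caption's 0-1 scales (NOT data from the figure):
# gender association (0 = female, 1 = male) and age association (0 = young, 1 = old).
gender = [0.1, 0.3, 0.5, 0.7, 0.9]
age = [0.2, 0.3, 0.4, 0.5, 0.6]

slope, intercept = ols_fit(gender, age)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

On this made-up input, a positive slope means that categories with more male associations also tend to have older associations, which is the direction of the relationship the trend line in the figure depicts.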