Fig. 3: K‑means clustering of official U.S. government documents using BERT embeddings.

Scatterplot of document representations encoded with BERT, reduced to two dimensions via PCA, and clustered with the K‑means algorithm. Each point represents a single document, colored by its assigned cluster.