Fig. 4: Workflow for data curation, LLM validation, and EVCS problem categorization.
From: Equity and reliability of public electric vehicle charging stations in the United States

We collect over 470,000 EV charging station user reviews and apply a large language model (LLM) to classify the sentiment of every review. A random subset of 3188 reviews is annotated by humans to validate the sentiment classification. Of these, 1081 reviews are identified as negative and are further categorized using grounded theory and human-labeled data to identify common problem types. The LLM is then used to categorize all 128,271 negative reviews, and its performance is assessed against the human-labeled categories.
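
The validation step, in which LLM output is compared with human annotations, can be illustrated with a minimal sketch. The caption does not state which agreement metrics were used, so the accuracy and per-class precision, recall, and F1 computed here are assumptions, and the function names and toy labels are hypothetical and not taken from the study.

```python
from collections import Counter


def accuracy(human, llm):
    """Fraction of reviews where the LLM label matches the human label."""
    if len(human) != len(llm):
        raise ValueError("label lists must have the same length")
    if not human:
        raise ValueError("label lists must not be empty")
    return sum(h == m for h, m in zip(human, llm)) / len(human)


def per_class_scores(human, llm):
    """Precision, recall, and F1 per label, treating human labels as ground truth."""
    if len(human) != len(llm):
        raise ValueError("label lists must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for h, m in zip(human, llm):
        if h == m:
            tp[h] += 1
        else:
            fp[m] += 1  # LLM predicted m, but the human label differs
            fn[h] += 1  # LLM missed the human label h
    scores = {}
    for label in sorted(set(human) | set(llm)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Toy data only (hypothetical); the real validation set has 3188 reviews.
human_labels = ["negative", "positive", "negative", "neutral", "positive", "negative"]
llm_labels = ["negative", "positive", "neutral", "neutral", "positive", "negative"]

overall = accuracy(human_labels, llm_labels)
by_class = per_class_scores(human_labels, llm_labels)
print(f"accuracy = {overall:.3f}")
for label, s in by_class.items():
    print(f"{label}: P={s['precision']:.2f} R={s['recall']:.2f} F1={s['f1']:.2f}")
```

The same two functions would apply unchanged to the problem-category stage, with problem types in place of sentiment labels and the 1081 human-categorized negative reviews as the reference set.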