Fig. 1
From: GMM-searcher: efficient object search in large-scale scenes using large language models

The proposed GMM-Searcher framework consists of three main components: a perception module, an adaptive-resolution topological graph with GMMs (ARTG-GMM), and an LLM. The perception module captures and processes panoramas and point clouds, and its output serves two purposes: it gives the LLM real-time observations, and it is stored in the ARTG-GMM as a record of past observations. The LLM reasons over both the real-time and the historical observations to formulate a search strategy. For repeated tasks, the GMM stores past task experiences, so the LLM's decisions become more accurate over time.
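
To make the idea of a GMM accumulating experience across repeated tasks concrete, the following is a minimal sketch, not the authors' implementation. It assumes that GMM denotes a Gaussian mixture model and that each graph node keeps a 2D mixture over the positions where an object was found in the past. The names `NodeMemory`, `Component`, `sigma`, and `merge_radius` are hypothetical, and the merge-by-distance update is a simplification chosen for brevity rather than a method taken from the paper.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class Component:
    """One mixture component: a running mean and the number of sightings behind it."""
    mean: tuple[float, float]
    count: int = 1


@dataclass
class NodeMemory:
    """Hypothetical per-node memory: an isotropic 2D Gaussian mixture over past sightings."""
    sigma: float = 0.5         # assumed fixed standard deviation of every component
    merge_radius: float = 1.0  # assumed distance within which a sighting joins a component
    components: list[Component] = field(default_factory=list)

    def add_sighting(self, x: float, y: float) -> None:
        """Fold a new sighting into the first nearby component, or start a new one."""
        for c in self.components:
            if math.dist(c.mean, (x, y)) <= self.merge_radius:
                c.count += 1
                mx, my = c.mean
                # Incremental update of the running mean.
                c.mean = (mx + (x - mx) / c.count, my + (y - my) / c.count)
                return
        self.components.append(Component((x, y)))

    def density(self, x: float, y: float) -> float:
        """Mixture density at (x, y); weights are proportional to sighting counts."""
        total = sum(c.count for c in self.components)
        if total == 0:
            return 0.0
        norm = 1.0 / (2.0 * math.pi * self.sigma ** 2)
        return sum(
            (c.count / total)
            * norm
            * math.exp(-math.dist(c.mean, (x, y)) ** 2 / (2.0 * self.sigma ** 2))
            for c in self.components
        )

    def best_guess(self) -> tuple[float, float] | None:
        """Mean of the most frequently observed component, or None with no history."""
        if not self.components:
            return None
        return max(self.components, key=lambda c: c.count).mean
```

In this sketch, each completed search would call `add_sighting` with the position where the object was found, so that `best_guess` and `density` increasingly favor locations confirmed by earlier tasks. How such a prior is actually passed to the LLM is not specified in the caption and is left out here.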