Fig. 1
From: A framework for robotic manipulation tasks based on multiple zero shot models

A robotic task is executed by invoking multiple modules within the “Panda Act” framework. The LLM in “Panda Act” autonomously selects which framework modules to call based on task instructions. The green modules represent the zero-shot models currently included in the framework (Text-Davinci-003 [39], Llama3 [40], GPT-4 [41], SAM [42], HQ-SAM [43], Mobile-SAM [44], CLIP [27], Open-CLIP [45], ImageBind [46]).