Fig. 1: Teacher-to-student communication model using a continuous compression of task solutions to low-dimensional message vectors. | Nature Communications

From: A framework for the emergence and analysis of language in social learning agents

a Model sketch depicting a generalist student agent that is provided messages from teacher agents for various tasks. The student learns to decode these messages and then perform the relevant tasks. b (Top) Representative navigation tasks used to train and test agents to analyze the social learning framework. Beginning in the bottom left corner, the agents aim to reach the goal (trophy) in as few steps as possible while avoiding the walls (light blue squares). (Bottom) Overlaid are example policies for tasks learned by the teacher agents. The student needs to decode the encoded version of this information it receives. Messages (m_i) may contain erroneous instructions or be misunderstood by the student (red squares). c Detailed communication architectures used in this study. In each of the three approaches, task information (Q-matrices in our framework) is first learned by teacher agents, who then pass this information through a sparse autoencoder (language proxy), which generates the associated low-dimensional representations, m_i. When student feedback is absent (top row), these representations m_i are provided directly to the student, who learns to interpret them to solve task i. In the case of student feedback (middle row), we also allow feedback from the student performance to propagate back to the language training and enhance the usefulness of the messages. The final schematic (bottom row) depicts the "closing-the-loop" architecture. Here, the student is trained on a set of messages from expert teachers. Once it is sufficiently competent, its task information is supplied to itself (after being passed through the language embedding trained with feedback), and the effect on performance is studied.
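
The compression step in panel c can be illustrated with a minimal sketch of a generic sparse autoencoder forward pass: a flattened Q-matrix is encoded to a low-dimensional message m_i and decoded back, with an L1 penalty encouraging sparse messages. This is not the authors' implementation. The grid size, message dimension, ReLU units, L1 weight, untrained random weights, and all function names are assumptions made for illustration only.

```python
import random

def make_weights(rows, cols, rng):
    """Small random weight matrix as a list of rows (illustrative, untrained)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(w, x):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def encode(w_enc, q_flat):
    """Teacher side: map a flattened Q-matrix to a message m_i (ReLU units, assumed)."""
    return [max(0.0, v) for v in matvec(w_enc, q_flat)]

def decode(w_dec, message):
    """Decoder side: reconstruct Q-values from the message."""
    return matvec(w_dec, message)

def sparse_ae_loss(q_flat, message, q_hat, l1_weight=0.01):
    """Mean squared reconstruction error plus an L1 sparsity penalty on the message."""
    mse = sum((a - b) ** 2 for a, b in zip(q_flat, q_hat)) / len(q_flat)
    return mse + l1_weight * sum(abs(m) for m in message)

rng = random.Random(0)
n_states, n_actions, message_dim = 16, 4, 5  # hypothetical sizes, not from the paper

# Stand-in for a teacher's learned Q-matrix (random values, for shape only).
q_matrix = [[rng.random() for _ in range(n_actions)] for _ in range(n_states)]
q_flat = [v for row in q_matrix for v in row]

w_enc = make_weights(message_dim, len(q_flat), rng)
w_dec = make_weights(len(q_flat), message_dim, rng)

message = encode(w_enc, q_flat)   # low-dimensional message m_i
q_hat = decode(w_dec, message)    # reconstruction of the task information
loss = sparse_ae_loss(q_flat, message, q_hat)
```

In the feedback variant described in the caption, a term reflecting the student's task performance would be added to this loss; that term is omitted here because the caption does not specify its form.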
