Fig. 3
From: Enhancing object pose estimation for RGB images in cluttered scenes

Multi-head self-attention comprises multiple self-attention units (heads). Each head has its own set of learnable weights and is computed independently of the others.
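The structure in the caption can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. The scaled dot-product form, the concatenation of head outputs, and the final output projection `w_o` are assumptions taken from the standard Transformer formulation, since the caption does not specify them. All function names, shapes, and the random weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_head(x, w_q, w_k, w_v):
    """One self-attention unit with its own weights (w_q, w_k, w_v)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Scaled dot-product attention (assumed form, not stated in the caption).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v


def multi_head_self_attention(x, heads, w_o):
    """Compute every head separately, then concatenate and project.

    The concatenation and the projection w_o are assumptions from the
    standard Transformer formulation.
    """
    outputs = [self_attention_head(x, *weights) for weights in heads]
    return np.concatenate(outputs, axis=-1) @ w_o


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 5, 16, 4
d_head = d_model // num_heads

x = rng.normal(size=(seq_len, d_model))
heads = [
    tuple(rng.normal(size=(d_model, d_head)) for _ in range(3))
    for _ in range(num_heads)
]
w_o = rng.normal(size=(d_model, d_model))

out = multi_head_self_attention(x, heads, w_o)
print(out.shape)  # (5, 16)
```

Each entry in `heads` holds a separate triple of weight matrices, so the heads share the input `x` but no parameters, which mirrors the "unique set of weights, computed separately" description in the caption.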