Fig. 4

An attention block with query Q and input X produces a weighted sum. The key K determines the weights through its similarity to the query Q.

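The computation described in the caption can be sketched in a few lines of plain Python. This is a minimal illustration, not the figure's exact formulation. It assumes that similarity is a dot product between Q and each key, that the weights are normalized with a softmax, and that the weighted sum is taken directly over the rows of X. The function names `softmax` and `attention` and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw similarity scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, keys, xs):
    """Weighted sum of the inputs xs, weighted by similarity of q to each key.

    Assumptions (not stated in the caption): dot-product similarity and
    softmax normalization of the weights.
    """
    # One similarity score per key: dot product of the query with that key.
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    weights = softmax(scores)
    # Weighted sum over the inputs, computed one output dimension at a time.
    dim = len(xs[0])
    output = [sum(w * x[d] for w, x in zip(weights, xs)) for d in range(dim)]
    return output, weights


# Toy example: three inputs, each paired with one key.
q = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
xs = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

output, weights = attention(q, keys, xs)
print("weights:", weights)
print("output:", output)
```

In this toy example the first key is the most similar to the query, so the first input receives the largest weight and contributes most to the output.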