Fig. 6: Multi-branch transformer architecture.
From: NeuralDEM for real time simulations of industrial particle flows

Schematic of a multi-branch transformer architecture. DiT (ref. 48) modulation is applied to each attention and MLP block but is omitted from the figure for visual clarity.
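
Because the modulation is not drawn in the schematic, a minimal sketch of what DiT-style modulation around a single block can look like is given here. It assumes the adaLN-Zero form (shift, scale and gate derived from a conditioning vector); the caption does not say which variant is used, and the names `layer_norm`, `modulated_block`, `w_mod` and `block_fn` are hypothetical, with `block_fn` standing in for either an attention or an MLP block.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    # Normalize over the feature axis, without learned affine parameters.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def modulated_block(x, cond, w_mod, block_fn):
    """Apply one residual block with DiT-style (adaLN-Zero) modulation.

    x:        (tokens, dim) token features
    cond:     (cond_dim,) conditioning vector (assumed, e.g. a timestep embedding)
    w_mod:    (cond_dim, 3 * dim) projection producing shift, scale and gate
    block_fn: stand-in for an attention or MLP block, mapping (tokens, dim) -> (tokens, dim)
    """
    shift, scale, gate = np.split(cond @ w_mod, 3)
    h = layer_norm(x) * (1.0 + scale) + shift  # modulate the normalized input
    return x + gate * block_fn(h)              # gated residual update


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
cond = rng.normal(size=(3,))

# With a zero-initialized projection the gate is zero, so the block is the identity.
w_zero = np.zeros((3, 24))
out_zero = modulated_block(x, cond, w_zero, np.tanh)

# With a non-zero projection the conditioning changes the block output.
w_rand = rng.normal(size=(3, 24))
out_rand = modulated_block(x, cond, w_rand, np.tanh)
```

In this sketch the same wrapper would be applied once around the attention block and once around the MLP block of each transformer layer, which matches the caption's statement that modulation is applied to both.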