Fig. 4

WeedSwin: a hierarchical vision transformer architecture with attention head counts that increase progressively from 6 to 48, feature enhancement through a Channel Mapper, eight decoder blocks with cross-attention mechanisms, and a Feature Pyramid Network for precise weed detection and localization.
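
As an illustration of the progressive head counts, the following minimal sketch shows one schedule that is consistent with the stated range. Only the endpoints 6 and 48 come from the caption. The four-stage layout, the doubling rule, the per-head dimension of 32, and the name `stage_schedule` are assumptions chosen to mirror common Swin Transformer configurations, and they may differ from the actual WeedSwin settings.

```python
def stage_schedule(base_heads=6, num_stages=4, head_dim=32):
    """Return a list of (heads, channels) pairs, one per backbone stage.

    Assumptions (not stated in the caption): heads double at every stage,
    there are four stages, and each head has a fixed dimension of 32.
    """
    schedule = []
    for stage in range(num_stages):
        heads = base_heads * 2 ** stage   # 6 at the first stage, doubled afterwards
        channels = heads * head_dim       # channel width implied by the head count
        schedule.append((heads, channels))
    return schedule


if __name__ == "__main__":
    for index, (heads, channels) in enumerate(stage_schedule(), start=1):
        print(f"stage {index}: {heads} heads, {channels} channels")
```

Under these assumptions the head counts are 6, 12, 24, and 48, which matches the range given in the caption. A different number of stages or a different growth rule would reach the same endpoints through other intermediate values.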