Fig. 2 | Scientific Reports

Fig. 2

From: WGAN-based multi-structure segmentation of vertebral cross-section MRI using ResU-Net and clustered transformer

The network consists of two main parts: a generator and a discriminator. Because conventional GAN generators often produce weak predictions, we use a U-Net-based backbone network to segment three types of structures: vertebral bodies, vertebral foramina, and laminae. Dilated convolutions and a cluster-based Transformer module (CTM) further improve boundary capture and pixel-level segmentation accuracy in the foreground. The CTM reduces the computational complexity of the attention mechanism by clustering the feature vectors in the Query matrix, which improves computational efficiency while maintaining accuracy. During training, the segmentation network's predictions are mixed with the true labels in an attempt to deceive the discriminator, and the discriminator's feedback in turn pushes the generator toward more accurate predicted labels.
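The idea of clustering the Query matrix to cut attention cost can be illustrated with a minimal NumPy sketch. This is not the authors' CTM implementation: the caption states only that feature vectors in the Query matrix are clustered, so the use of k-means, the centroid-level attention, and the copying of each centroid's output back to its member queries are all assumptions, and the function names (`kmeans_assign`, `clustered_attention`, `full_attention`) are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def kmeans_assign(q, n_clusters, n_iters=10, seed=0):
    # Plain k-means over query vectors (assumed clustering method).
    rng = np.random.default_rng(seed)
    centroids = q[rng.choice(len(q), size=n_clusters, replace=False)].copy()
    for _ in range(n_iters):
        # Squared distance from every query to every centroid: (N_q, C).
        dist = ((q[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for c in range(n_clusters):
            members = q[labels == c]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                centroids[c] = members.mean(axis=0)
    return labels, centroids


def full_attention(q, k, v):
    # Standard scaled dot-product attention: score matrix is (N_q, N_k).
    scale = 1.0 / np.sqrt(q.shape[-1])
    return softmax(q @ k.T * scale) @ v


def clustered_attention(q, k, v, n_clusters):
    # Attention is computed once per query centroid: score matrix is (C, N_k).
    labels, centroids = kmeans_assign(q, n_clusters)
    scale = 1.0 / np.sqrt(q.shape[-1])
    out_per_cluster = softmax(centroids @ k.T * scale) @ v
    # Each query reuses the output of the cluster it belongs to.
    return out_per_cluster[labels]


rng = np.random.default_rng(42)
n_tokens, dim = 64, 16
q = rng.normal(size=(n_tokens, dim))
k = rng.normal(size=(n_tokens, dim))
v = rng.normal(size=(n_tokens, dim))

approx = clustered_attention(q, k, v, n_clusters=8)
exact = full_attention(q, k, v)
print(approx.shape, float(np.abs(approx - exact).mean()))
```

With C clusters, the score matrix shrinks from N_q × N_k to C × N_k, which is where the saving comes from under these assumptions. The price is that all queries in a cluster share one output, so the choice of C trades efficiency against fidelity to full attention.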
