Fig. 1 | Scientific Reports

Fig. 1

From: Deep multi-task learning framework for gastrointestinal lesion-aided diagnosis and severity estimation

The deep MTL framework of the proposed method. (a) MTL for GT lesion diagnosis and severity estimation in a unified manner, using the backbone network ResNet50 and CViT to extract features \(F_i\). The features \(F_i\) are fused and passed to the next stage, and the weights are shared across tasks to enhance model interpretability. (b) The fused output is used by EMA to obtain attention heads for each task, shown here with the convolutionally projected queries, keys, and values \((Q, K, V)\). After SoftMax normalizes the attention scores, the outputs of all heads are concatenated and transformed. The final multi-head output is a combined representation that captures diverse patterns and relationships among image patches and tokens across the attention heads.
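The attention steps named in panel (b) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: it assumes simple linear projections in place of the convolutional ones, scaled dot-product scores, and heads formed by slicing the feature dimension. All function names, matrix shapes, and the random weights are hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Normalize one row of attention scores so it sums to 1."""
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(q, k, v):
    """Scaled dot-product attention for a single head (assumed form)."""
    d_head = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
    return matmul(weights, v), weights


def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Project tokens to Q, K, V, attend per head, concatenate, transform."""
    # Assumption: linear projections stand in for the convolutional ones.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_model = len(q[0])
    assert d_model % num_heads == 0, "feature size must divide evenly into heads"
    d_head = d_model // num_heads

    head_outputs = []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        out, _ = attention_head(
            [r[lo:hi] for r in q], [r[lo:hi] for r in k], [r[lo:hi] for r in v]
        )
        head_outputs.append(out)

    # Concatenate the heads token by token, then apply the output transform.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(x))]
    return matmul(concat, w_o)


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or fused features."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
tokens = random_matrix(4, 8, rng)  # 4 tokens with 8 fused features each
w_q, w_k, w_v, w_o = (random_matrix(8, 8, rng) for _ in range(4))
fused = multi_head_attention(tokens, w_q, w_k, w_v, w_o, num_heads=2)
```

Here `fused` holds one 8-dimensional vector per input token. Splitting the feature dimension across heads lets each head attend over a different subspace before the output transform mixes them, which is the usual motivation for concatenating heads rather than averaging them.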
