Table 1 Summary of recent attention-based histopathology models compared to the proposed HistoDARE.

From: A histopathology aware DINO model with attention based representation enhancement

| Model | Backbone | Attention type | Key remarks/limitations |
|---|---|---|---|
| HistoSSL^18 | Multi-branch SSL | Multi-level fusion | Reduces annotation need; lacks spatial explainability |
| MMAE^19 | MAE + H&E/RGB | Dual-modality | Enhances morphology; requires stain alignment |
| CycleGAN-SSL^20 | GAN-based SSL | Cycle-consistent | Improves domain transfer; high training cost |
| TransFuse^21 | CNN + Transformer | Parallel fusion | Strong accuracy; dual-branch inference overhead |
| DS-TransUNet^22 | Swin Transformer | Dual-scope | Good context modeling; high memory need |
| EG-TransUNet^23 | Swin + Attention | Enhancement module | Improved mDice; limited validation |
| FCB-SwinV2^24 | SwinV2 hybrid | Channel fusion | Very deep; heavy computation |
| SAG-ViT^25 | EfficientNet + GAT | Graph attention | High F1; multi-stage and costly |
| BioSAM-2/SAM-2^26,27 | SAM/ViT | Prompt-based | Zero-shot; requires user interaction |
| Bilal & Asif (2025)^17 | CNN + SA | Lightweight fusion | Efficient; dataset-specific |
| Hekmat (2025)^16 | CNN + Transformer | Sequential attention | More interpretable; higher complexity |
| HistoDARE (Ours) | DINOv2 (ViT-L/14) | Hierarchical (Spatial + Channel + Residual) | Unified design; interpretable and efficient (see the sketch below) |
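
The last row names a hierarchical enhancement combining spatial and channel attention with a residual connection over DINOv2 ViT-L/14 features. The following is a minimal PyTorch sketch of such a module, not the authors' implementation: the class name `HierarchicalAttentionEnhancer`, the squeeze-and-excitation-style channel gate, the per-token spatial gate, the gating order, and all dimensions are illustrative assumptions.

```python
# Minimal sketch of hierarchical spatial + channel attention with a residual
# connection. All names, dimensions, and the gating order are assumptions for
# illustration; they are not taken from the paper.
import torch
import torch.nn as nn


class HierarchicalAttentionEnhancer(nn.Module):
    """Hypothetical spatial + channel attention with a residual connection,
    applied to DINOv2 ViT-L/14 patch tokens (embedding dim = 1024)."""

    def __init__(self, dim: int = 1024, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze-and-excitation-style gate over features.
        self.channel_gate = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.Sigmoid(),
        )
        # Spatial attention: one scalar weight per patch token.
        self.spatial_gate = nn.Sequential(
            nn.Linear(dim, 1),
            nn.Sigmoid(),
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_patches, dim) from the ViT-L/14 backbone.
        channel_w = self.channel_gate(tokens.mean(dim=1))  # (B, dim)
        x = tokens * channel_w.unsqueeze(1)                # channel reweighting
        spatial_w = self.spatial_gate(x)                   # (B, N, 1)
        x = x * spatial_w                                  # spatial reweighting
        return tokens + x                                  # residual connection


# Usage: enhance patch tokens from a DINOv2 backbone.
tokens = torch.randn(2, 256, 1024)           # e.g. 16x16 patches, ViT-L width
enhancer = HierarchicalAttentionEnhancer()
enhanced = enhancer(tokens)                  # same shape: (2, 256, 1024)
```

Under these assumptions, the residual addition lets the module fall back to the unmodified backbone features, and the per-token spatial weights can be rendered directly as a heatmap over patches, which is one plausible reading of the "interpretable and efficient" remark in the table.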