Table 8. Previous studies on PE classification. Here, Acc, AUC (also written AUROC), CC, MIL, sen, spc, SSL, and SPE-YOLO stand for accuracy, area under the (receiver operating characteristic) curve, conventional classification, multiple instance learning, sensitivity, specificity, self-supervised learning, and SE-Attention Prioritizes Features PE-You Only Look Once, respectively.

From: Improved pulmonary embolism detection in CT pulmonary angiogram scans with hybrid vision transformers and deep learning techniques

| Study | Dataset | Architecture | Task type | Findings |
|---|---|---|---|---|
| Condrea et al. [15] | RSNA | Dual-hop DNN with anatomically aware pretraining | PE detection from CT images | sen = 92.05%, spc = 96.17% |
| Ma et al. [26] | RSNA | Two-phase multi-task learning with interpretability (Grad-CAM, attention) | PE detection + localization + chronicity + RV/LV ratio | AUROC = 0.93, sen = 86.02% |
| Khan et al. [27] | RSNA | DL framework based on DenseNet201 (feature extractor) + customized fully connected layers | Multi-class classification (PE classification into 9 classes) | Acc = 88.01%, sen = 88.00%, AUC = 0.90 |
| Islam et al. [28] | CTPA dataset (specific dataset not mentioned) | Comparative study: CNNs vs ViTs; SSL vs supervised; transfer learning vs training from scratch; CC vs MIL | PE diagnosis (image-level and exam-level) | AUC = 0.96 |
| Lynch et al. [32] | RSNA | PE-DeepNet: hybrid deep CNN with reduced parameters | PE classification | Acc = 94.21% |
| Suman et al. [36] | RSNA | Two-stage attention-based CNN-LSTM network | PE detection + type (chronic/acute) + location (left/right/central) | AUC = 0.95 |
| Wu et al. [35] | Tianjin internal (n = 142) + RSNA test set (n = 2000) | SPE-YOLO: YOLOv8 + P2 head + SE-Attention + ODConv for small PE detection | Small PE detection | sen = 90.71%, Acc = 86.45% |
| Mohammed et al. [20] | RSNA | EfficientNet-B7 + enhanced ViT with multi-task learning | Multi-task classification (PE detection, location, type) | AUC = 0.96 |
| Cahan et al. [37] | Internal multimodal dataset (3D CTPA + clinical data) | Bilinear attention + TabNet (structured + imaging) | PE severity risk stratification (classification) | AUC = 0.96, sen = 90.00%, spc = 94.00% |
| The proposed method | RSNA | Ensemble approach (ResNet50 + DenseNet121 + Swin Transformer) | Binary classification | Acc = 97.80%, AUROC = 0.99 |
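
The Findings column mixes several metrics (sen, spc, Acc, AUC), so the rows are not always directly comparable. As a reminder of how these quantities relate for a binary task, the following is a minimal sketch using their standard definitions. The class and function names, the confusion counts, and the scores are hypothetical illustrations; none of them come from the studies listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container for illustration)."""
    tp: int  # PE-positive cases predicted positive
    fp: int  # PE-negative cases predicted positive
    tn: int  # PE-negative cases predicted negative
    fn: int  # PE-positive cases predicted negative


def sensitivity(c: ConfusionCounts) -> float:
    # sen = TP / (TP + FN): fraction of true PE cases that are detected
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # spc = TN / (TN + FP): fraction of PE-free cases correctly cleared
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    # Acc = (TP + TN) / all cases
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def auc(pos_scores: list[float], neg_scores: list[float]) -> float:
    # AUC as the probability that a random positive case is scored above
    # a random negative case; ties count as half a correctly ordered pair.
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up counts and scores, for illustration only.
counts = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
sen, spc, acc = sensitivity(counts), specificity(counts), accuracy(counts)
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

Sensitivity, specificity, and accuracy depend on the decision threshold that produces the confusion counts, whereas AUC summarizes ranking quality across all thresholds. A row that reports only AUC and a row that reports only Acc therefore measure different things.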