Scientific Reports

Table 1 Comparison of existing approaches and proposed model for image forgery detection.

From: Multi-resolution transfer learning for tampered image classification using SE-enhanced fused-MBConv and optimized CNN heads

Drawback addressed	Existing approaches	Proposed model
Localization	Approaches like LBP + DCT (e.g.,³) achieve high accuracy but lack localization and computational efficiency	EfficientNetV2B0 backbone with SE-attention and Fused MBConv enhances feature extraction for better localization
Adaptability to unseen forgeries	Gabor wavelets, LPQ with NMF (e.g.,⁴) show strong rotation and scale invariance but are limited by handcrafted texture descriptors	Deep learning-based approach with transfer learning and Focal Loss, improving adaptability to unseen forgeries
Deep semantic understanding	Statistical methods like GLCM and BDCT (e.g.,⁵) perform well under noise but lack deep semantic understanding	The model integrates deep learning features, enhancing semantic understanding for forgery detection
Adversarial robustness	Hybrid systems like CNN-DWT (e.g.,⁸) lack both localization capability and robustness against adversarial attacks	The proposed model is robust to adversarial attacks, enhancing generalization and localization across datasets
Real-time applicability	Vision Transformer (ViT) with SAM (e.g.,¹¹) incurs high computational overhead, limiting real-time applicability	The proposed model is optimized for real-time detection with minimal latency using a lightweight architecture
Robustness under compression	ResNet-50 with multi-scale loss (e.g.,¹²) improves boundary attention but struggles under compression	EfficientNetV2B0 backbone with SE-attention and Focal Loss ensures robustness under compression and diverse manipulations
Generalization across datasets	Methods like CLAHE-boosted CNN + SVM (e.g.,⁹) struggle with cross-dataset generalization	The model outperforms 42 state-of-the-art methods, showing high generalization across diverse datasets and manipulation types
Complexity and latency	Complex models like ResNet-ViT hybrid (e.g.,¹⁹) are computationally expensive with high latency	The proposed model reduces complexity with a lightweight architecture while achieving high performance for real-time use
Handling subtle manipulations	MAC-Net (e.g.,¹⁸) is limited to splicing cases and struggles with subtle manipulations	The proposed model excels in detecting subtle manipulations using SE-attention and multi-resolution feature extraction
Rotation and scale invariance	Gabor wavelets + LPQ (e.g.,⁴) provide rotation and scale invariance but are limited by handcrafted features	The proposed model’s deep learning approach allows better feature extraction, improving performance on rotated and scaled images
Boundary awareness	ME-Net (e.g.,¹³) enhances edge localization but introduces complexity	EfficientNetV2B0 with SE-attention blocks improves boundary detection without additional complexity
Cross-resolution generalization	Models like ConvNeXtFF (e.g.,¹⁵) struggle with fixed input resolution and high resource demands	The proposed model generalizes across multiple image resolutions and manipulation types effectively
Handling noisy data	CSR-Net (e.g.,¹⁷) introduces spline-based regression but is sensitive to thresholds and curve-fitting	The model handles noisy data effectively with the integration of SE-attention and Focal Loss for robust learning
Generalization to various manipulation types	LBRT (e.g.,¹⁶) is limited to copy-move detection and struggles with subtle manipulations	The proposed model handles multiple forgery types (splicing, copy-move, hybrid) and performs well under varying conditions
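
Several rows above credit Focal Loss for adaptability and robustness. As a rough illustration only, the sketch below shows the standard binary focal loss formulation, which down-weights easy examples so that training concentrates on hard ones. The function names, the `gamma` and `alpha` defaults, and the convention that label 1 means "tampered" are assumptions for this example; the table does not state the settings the authors used.

```python
import math

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for a single prediction (illustrative sketch).

    p     : predicted probability of class 1 (assumed here to mean "tampered")
    y     : ground-truth label, 0 or 1
    gamma : focusing parameter; 0 reduces to weighted cross-entropy (assumed default)
    alpha : weight for class 1; class 0 gets 1 - alpha (assumed default)
    """
    # Clamp the probability to avoid log(0).
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability the model assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch of (probability, label) pairs."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With `gamma=0` the modulating factor disappears and the expression becomes class-weighted cross-entropy, which makes it easy to compare the two losses on the same predictions.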
