Table 1 Comparison of different algorithms for KFE.

From: Key frame extraction algorithm for surveillance videos using an evolutionary approach

| Algorithm | Architecture/methodology | Limitations |
| --- | --- | --- |
| Deep Learning (DL) | DL models such as Capsule Net, YOLO, LSTM, Deep Summarization Network (DSN), Fully Connected Sequence Network, Attention-based Encoder-Decoder Network, RNN, Chunk and Stride Net (CSNet), Summary Net, and Detect to Summarize Net (DSNet) were investigated | These models were evaluated only on specific datasets. They are computationally expensive, requiring powerful GPUs and significant training time. DL requires tuning many hyperparameters, such as learning rate, model size, and regularization techniques; this process is challenging and often involves much trial and error, unlike GAs, which are self-adaptive |
| Image Processing and Machine Learning (ML) | K-Means Clustering with DCT, Sparse Dictionary Selection, Semantic Graphs, Odometry Estimation by Scan Line Similarity, SIFT, Color SIFT, HWVP Descriptors, GIST, HSV, PHOG Descriptors with K-Means, SSPA Curve, LBP, Optical Flow | They are designed for specific applications. ML is limited by the choice of model and hyperparameters. Image processing is task-specific and non-adaptable, unlike GAs, which are highly adaptive and evolve dynamically |
| Evolutionary Approaches | Evolutionary algorithms applied to image segmentation [7], image fusion [8], and solving COPs [9, 14]; GA-based KFE | The fitness function (FF) used in GA-based KFE relies only on color distance. It lacks generality and does not apply to all types of videos. There is no effective survivor selection mechanism |
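
The limitation noted for GA-based KFE, a fitness function (FF) that depends only on color distance, can be illustrated with a minimal sketch. This is not the method of any work cited in the table: the names `color_distance` and `fitness`, the chromosome encoding (a list of frame indices), and the toy histograms are all assumptions made for illustration. It assumes each frame has already been reduced to a color histogram.

```python
import math
from typing import List, Sequence


def color_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Euclidean distance between two color histograms of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def fitness(chromosome: Sequence[int], histograms: List[Sequence[float]]) -> float:
    """Hypothetical color-only FF: mean color distance between
    consecutive selected frames (higher means more varied key frames)."""
    # Assumed encoding: the chromosome is a list of candidate key-frame indices.
    selected = sorted(set(chromosome))
    if len(selected) < 2:
        return 0.0  # a single frame has no pairwise distance
    dists = [
        color_distance(histograms[i], histograms[j])
        for i, j in zip(selected, selected[1:])
    ]
    return sum(dists) / len(dists)


# Toy data (assumed): 3-bin normalized color histograms for four frames.
histograms = [
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],  # same colors as frame 0
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

redundant = fitness([0, 1], histograms)   # two identically colored frames
diverse = fitness([0, 2, 3], histograms)  # three differently colored frames
```

In this toy case the candidate with differently colored frames scores higher than the one with identically colored frames. The same sketch also shows the weakness listed in the table: two frames with identical color histograms but different content (for example, a person entering a scene that keeps the same overall colors) would get a distance of zero, so a color-only FF cannot tell them apart.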