Table 19. A comprehensive overview of contributions in the existing literature compared with our study.
Paper | Year | Methodology | Datasets Used | Mean F1-score | Key Contributions |
|---|---|---|---|---|---|
FastLogAD | 2024 | Sequence-based unimodal unsupervised learning | | 94.18 | 1) Efficient utilization of normal data. 2) Innovative anomaly generation method. |
pylogsentiment | 2020 | Semantic-based unimodal supervised learning | 1) Spark 2) Honey5 3) Windows 4) Casper 5) Jhuisi 6) Nssal 7) Honey7 8) Zookeeper 9) Hadoop 10) BlueGene/L | 99.14 | 1) Implements sentiment analysis for log anomaly detection. 2) Addresses class imbalance. |
RAGLog | 2024 | LLM-based unimodal unsupervised learning | 1) BlueGene/L 2) Thunderbird | 89.00 | 1) A retrieval-augmented generation model that employs a vector database to store normal log entries. |
UMFLog | 2023 | Independent network-based multimodal unsupervised learning | 1) HDFS 2) BlueGene/L 3) Thunderbird | 99.56 | 1) Employs a dual-model architecture with BERT for semantic feature extraction and VAE for statistical feature analysis. 2) Handles long sequences of log data. |
LogMS | 2024 | Early fusion-based multimodal unsupervised & semi-supervised learning | 1) HDFS 2) BlueGene/L | 99.10 | 1) Employs a two-step model. The first step uses a multi-source information fusion-based LSTM to detect anomalies by utilizing semantic, sequential, and quantitative data. Following that, a probability label estimation-based GRU network is used. |
MDFULog | 2023 | Intermediate fusion-based multimodal supervised learning | 1) HDFS 2) OpenStack132 | 97.00 | 1) Addresses noise in log data. 2) Informer-based anomaly detection. |
MFF | 2023 | Late fusion-based multimodal supervised learning | 1) LLSD244 | 93.10 | 1) Detects web scanning behavior by considering HTTP textual content, status code, and frequency features. 2) Employs a late fusion MFF-based network to detect anomalies. |
CoLog | 2025 | Intermediate fusion-based multimodal supervised learning | 1) Spark 2) Honey5 3) Windows 4) Casper 5) Jhuisi 6) Nssal 7) Honey7 8) Zookeeper 9) Hadoop 10) BlueGene/L | 99.99 | 1) Encodes log records collaboratively according to various log modalities. 2) Employs MAL to address heterogeneity among modalities. 3) Outperforms state-of-the-art methods. 4) Detects point and collective abnormalities within a unified framework. |