Table 19. A comprehensive overview of contributions in the existing literature compared to our study.

From: A unified framework for detecting point and collective anomalies in operating system logs via collaborative transformers

| Paper | Year | Methodology | Datasets Used | Mean F1-score (%) | Key Contributions |
| --- | --- | --- | --- | --- | --- |
| FastLogAD [77] | 2024 | Sequence-based unimodal unsupervised learning | 1) HDFS\(^{1}\) [12] 2) BlueGene/L 3) Thunderbird [132] | 94.18 | 1) Efficient utilization of normal data. 2) Innovative anomaly generation method. |
| pylogsentiment [38] | 2020 | Semantic-based unimodal supervised learning | 1) Spark 2) Honey5 3) Windows 4) Casper 5) Jhuisi 6) Nssal 7) Honey7 8) Zookeeper 9) Hadoop 10) BlueGene/L | 99.14 | 1) Implements sentiment analysis for log anomaly detection. 2) Addresses class imbalance. |
| RAGLog [92] | 2024 | LLM-based unimodal unsupervised learning | 1) BlueGene/L 2) Thunderbird | 89.00 | 1) A retrieval-augmented generation model that employs a vector database to store normal log entries. |
| UMFLog [84] | 2023 | Independent network-based multimodal unsupervised learning | 1) HDFS 2) BlueGene/L 3) Thunderbird | 99.56 | 1) Employs a dual-model architecture with BERT for semantic feature extraction and VAE for statistical feature analysis. 2) Handles long sequences of log data. |
| LogMS [59] | 2024 | Early fusion-based multimodal unsupervised & semi-supervised learning | 1) HDFS 2) BlueGene/L | 99.10 | 1) Employs a two-step model. The first step uses a multi-source information fusion-based LSTM to detect anomalies by utilizing semantic, sequential, and quantitative data. Following that, a probability label estimation-based GRU network is used. |
| MDFULog [45] | 2023 | Intermediate fusion-based multimodal supervised learning | 1) HDFS 2) OpenStack [132] | 97.00 | 1) Addresses noise in log data. 2) Informer-based anomaly detection. |
| MFF [44] | 2023 | Late fusion-based multimodal supervised learning | 1) LLSD244 | 93.10 | 1) Detects web scanning behavior by considering HTTP textual content, status code, and frequency features. 2) Employs a late fusion MFF-based network to detect anomalies. |
| CoLog | 2025 | Intermediate fusion-based multimodal supervised learning | 1) Spark 2) Honey5 3) Windows 4) Casper 5) Jhuisi 6) Nssal 7) Honey7 8) Zookeeper 9) Hadoop 10) BlueGene/L | 99.99 | 1) Encodes log records collaboratively according to various log modalities. 2) Employs MAL to address heterogeneity among modalities. 3) Outperforms state-of-the-art methods. 4) Detects point and collective abnormalities within a unified framework. |

\(^{1}\)Hadoop Distributed File System.