Table 1 Summary of the studies.
From: Automated detection of corruption reports in text via deep reinforcement learning
| Reference | Year | Research goal | Method | Limitation/Key finding |
|---|---|---|---|---|
| Lima et al. [9] | 2020 | Predict corruption perception across countries | Machine Learning (Random Forest) | Country-level prediction; not text-based |
| Liu [12] | 2024 | Detect corrupt textual data | Unsupervised learning | Unsupervised text detection; not supervised classification |
| Utomo et al. [19] | 2022 | Predict Anti-Corruption Disclosure (ACD) from firm data | Deep Neural Network | ACD prediction from structured firm data; not raw text classification from reports |
| Li et al. [20] | 2020 | Identify self-reported corruption on Twitter | Unsupervised Machine Learning (Biterm topic model) | Unsupervised topic modeling of tweets; no multi-class classification of specific types |
| Umer et al. [21] | 2023 | Investigate FastText with CNNs for text classification | CNN with FastText embeddings | General text classification; highlights FastText + CNN efficacy |
| Mohammed and Kora [22] | 2022 | Develop effective ensemble for text classification | Ensemble Deep Learning (meta-classifier) | Ensemble improves accuracy but increases computational cost |
| Ash et al. [23] | 2021 | Analyze and support anti-corruption policy | Machine Learning (tree-based gradient-boosted classifier) | Corruption detection from administrative/structured data; not raw text classification |
| Li [24] | 2023 | Textual data mining for financial fraud detection | Deep Learning (Neural Network models) | Financial fraud in regulatory texts; not multi-class corruption in social media |
| Muco [26] | 2024 | Assess corruption from text data | NLP methods (using human-coded data) | Corruption assessment using human-coded text; specific method details unclear in review |
| Chen et al. [27] | 2022 | Automated legal text classification | Random Forest with domain concepts vs. Deep Learning | Feature engineering (Random Forest) outperformed DL in specific legal domain |
| Dogra et al. [28] | 2022 | Review state-of-the-art NLP models for text classification | Review paper (various ML/DL models) | Comprehensive survey of text classification methods |
| Mittal et al. [29] | 2021 | Multi-label text classification | Deep Graph-LSTM | Graph-based LSTM for multi-label text; not social media |
| Köksal and Akgül [30] | 2022 | Comparative study of deep learning for text classification | DNN, CNN, LSTM, GRU with hyperparameter tuning | Improvements with word embeddings and hyperparameter tuning in general text classification |
| Soni et al. [31] | 2023 | Develop CNN-based architecture for text classification | TextConvoNet (2D multi-scale CNN) | Novel 2D CNN (TextConvoNet) captures inter/intra-sentence features for general text classification |
| Bangyal et al. [32] | 2021 | Detect fake news text (COVID-19) | Various ML/DL (CNN, LSTM, RNN, GRU) with TF–IDF | Fake news detection in microblogs; applies various DL models |
| Xiong et al. [33] | 2023 | Extreme multi-label text classification (XMTC) | XRR (Retrieving and Deep Ranking with Transformers) | Two-stage transformer for extreme multi-label classification |
| Abarna et al. [34] | 2022 | Idiom/literal text classification | Ensemble K-BERT (with knowledge graphs) | Advanced semantic classification using knowledge-enhanced BERT |