Table 2 Comprehensive review of recent crash severity prediction methods.

From: AI-based prediction of traffic crash severity for improving road safety and transportation efficiency

References

Year

Applied dataset

Dataset size

No. records

Applied algorithms

Results

7

2024

Highway 401 (Canada), U.S. highways

Large

 ~ 1 M records

Deep Neural Networks, Adaptive Feature Selection

Deep Neural Networks: ~ 90% accuracy

8

2022

Traffic accident records

Moderate

 ~ 500 K records

Boruta Algorithm, RF, XGBoost, Naïve Bayes

XGBoost: 82.10%, Naïve Bayes: 79.52%

9

2022

Various disease risk datasets

Moderate

 ~ 100 K–500 K

Feature Selection Review

Feature selection review; no accuracy reported

10

2024

UAH-DriveSet dataset

Large

 ~ 1.2 M records

Wrapper FS, RF

RF: 96.4%, KNN: 96.29%

11

2023

Traffic accident data

Large

750 K accidents

DT, RF, LR

RF: 85%, DT: 80%

12

2025

Traffic data from Turkey

Small

13,234 accident records

KNN, RF, XGBoost, DNN, DNN + RF

Accuracy: 92%

13

2024

Real-time road data

Moderate

 ~ 600 K real-time traffic data

SMOTE, AdaBoost, XGBoost, RF

SMOTE + XGBoost: 88%, AdaBoost: 85%

14

2020

Real-time vehicle and environmental data

Large

 ~ 850 K real-time traffic records

Bayesian Learners, KNN, SVM, MLP

Boosting Model: 93.66% F1-score

15

2022

NGSIM Traffic Data

Large

 ~ 900 K lane change instances

XGBoost, Recursive Feature Elimination

XGBoost with Feature Engineering: 97.6%

16

2023

Real-time vehicle telemetry

Large

 ~ 1 M vehicle telemetry records

LGBM, ENN-SMOTE-Tomek Link

LGBM with Feature Selection: 91.5%

17

2023

GIDAS (German In-Depth Accident Study)

Small

11,074 collision scenarios

K-Means++, k-NN

35 clusters of car-to-car collision configurations

18

2021

UCI Automobile Dataset

Moderate

 ~ 150 K automobile price records

LASSO Regression, Stepwise Selection

LASSO Regression (Testing): 87% accuracy

19

2022

Azure ML Studio dataset

Moderate

 ~ 300 K samples

Spearman Correlation, Pearson Correlation

Fisher Score: Best among tested methods

20

2019

Vehicles’ trajectory data

Small

822 candidate features

Feature Selection & Prediction Models

RF with SMOTETomek: 80.3%

21

2022

Accident severity datasets

Moderate

 ~ 650 K crash severity data

Gradient Boosting, Feature Engineering

Gradient Boosting: 89%, LR: 85%

22

2023

New Zealand road accident data

Moderate

67,971 records

RF, AdaBoost, XGBoost, LGBM, CatBoost

RF with SMOTE: 81.45%

23

2023

Traffic accident data from the Qassim Province, Saudi Arabia

Small

3506 accidents

LR, RF, XGBoost

XGBoost: AUC 87%, RF: AUC 87%, LR: AUC 62%

24

2022

Traffic crash data from Al-Ahsa, Saudi Arabia

Small

9031 records

Binary Logistic Regression, Regression Tree Model

LR: 73%, CART Model: 74%

25

2022

Road traffic crash data from Highway 15 in Saudi Arabia

Small

3439 records

RF, KNN

RF: 78.7%, KNN: 75%

26

2025

Crash Report Sampling System

Moderate

45,373 records

Transformer architecture

Test accuracy: 93%