Exploring the injury severity of unlicensed powered two- and three-wheeler drivers in two-vehicle crashes in China

Xu, Peixiang; Wei, Fulu; Guo, Dong; Guo, Yongqing; Sun, Lizu; Liu, Chuan; Zhou, Bin

doi:10.1038/s41598-025-88896-3


Article
Open access
Published: 06 April 2025


Scientific Reports volume 15, Article number: 11802 (2025)


Abstract

The large presence of unlicensed powered two- and three-wheeler (PTW) drivers in China poses a significant threat to road safety. In this study, a customized Deep Forest model (DF-ptw) is constructed to investigate the effect of unlicensed PTW drivers on crash severity in two-vehicle crashes, using three years of recent historical crash data. SHapley Additive exPlanations (SHAP) and Partial Dependence Plot (PDP) analyses reveal that unlicensed motorcyclists are significantly more likely to suffer serious injuries in two-vehicle crashes than unlicensed auto-rickshaw drivers. Additionally, factors such as drunk driving, fatigued driving, and being an unlicensed driver over the age of 53 notably elevate the risk of serious injury or death, with unlicensed motorcyclists being disproportionately affected. Moreover, self-employed unlicensed PTW drivers face a higher probability of serious injury or fatality in crashes than farmers, blue-collar workers, and white-collar workers. Unlicensed PTW drivers are also more susceptible to severe or fatal injuries on national and provincial roads, in low-visibility conditions, during late-night hours, on non-separated roads, and at dusk or dawn. Based on these findings, this study proposes reducing the frequency and severity of crashes involving unlicensed PTW drivers through more stringent eligibility checks, increased safety awareness, and advanced safety measures.


Introduction

Background

Powered two- and three-wheelers (PTWs), such as motorcycles and auto-rickshaws, are highly accessible and play an important role in transportation, complementing, competing with, and supplementing other modes, especially in developing countries¹. However, PTWs often mix with four-wheeled vehicles in traffic and are more prone to loss of control, increasing their likelihood of being involved in serious two-vehicle crashes². PTWs account for 27% of road traffic fatalities in China, which means almost 70,000 people die in PTW-involved crashes each year³. Additionally, there are many unlicensed PTW drivers in China, and their lack of professional driver training significantly increases road safety risks.

In China, a highly permissive vehicle market enables buyers to acquire PTWs without possessing a driver’s license. The abundance of second-hand PTWs with existing license plates in the unregulated market allows individuals without licenses to drive freely on roads without intervention from traffic police⁴. Additionally, drivers without specialized training are often unfamiliar with traffic laws, making them more likely to be involved in two- or multiple-vehicle collisions, especially at intersections and when other drivers are not paying attention to them^5,6. Therefore, it is necessary to explore the impact of unlicensed PTW drivers on crash severity in two-vehicle crashes.

A brief review of past studies

This section provides a brief review of prior research on factors affecting the severity of two-vehicle crashes involving various PTW models. It then shifts focus to the impact of driving PTWs without a license on crash outcomes and discusses accident analysis methods pertinent to this study.

PTW drivers tend to suffer more severe injuries in crashes than drivers of four-wheeled vehicles. Consequently, researchers have conducted numerous studies over the years on the risks and severity of crashes affecting PTW safety. Previous studies conducted in various regions have demonstrated that several factors significantly affect the severity of PTW driver injuries, including PTW type, age, gender, roadway type, speed limits, time of day, and natural environment^7,8. However, differences between PTW models can lead to variations in how the same influencing factor affects crash severity.

For motorcyclists, older age (over 65), alcohol involvement, and not wearing a helmet significantly affect serious injury outcomes^9,10. In two-vehicle crashes, the crash partner also influences crash severity. Motorcycles are more likely to be involved in serious crashes with trucks because of the large number of blind spots in the truck driver’s field of vision¹¹. The location (e.g., intersection) and type of crash (e.g., rear-end) also contribute to the severity of two-vehicle crashes^12,13. In addition, several external environmental factors can significantly affect serious injury outcomes, such as slippery road surfaces, lack of lighting at night, and poor visibility¹⁴.

Auto-rickshaws involved in two-vehicle crashes differ somewhat from motorcycles and mopeds. Auto-rickshaws have a lower risk of serious outcomes in two-vehicle crashes than motorcycles, because the enclosure offers its occupants some protection in a crash with a car or a heavier vehicle¹⁵. However, most auto-rickshaws do not have seat belts, and in a crash the driver is often thrown from the vehicle, resulting in head injuries¹⁶. Additionally, a study from Pakistan identified several factors that exacerbate the severity of crashes involving auto-rickshaws, including driving during the daytime, on weekdays, in off-peak periods, and under clear weather conditions⁸.

Regardless of the type of PTW, studies have shown that unlicensed driving exacerbates the risk of crashes, particularly in developing countries. Unlicensed drivers are more likely to be farmers in suburban or rural areas. Factors such as economic conditions, the accessibility and availability of driver licensing and training, and age limitations lead these groups to drive without a license¹⁷. Dangerous driving behaviors, such as driving under the influence of alcohol, driving at night, speeding, and running red lights, often accompany unlicensed driving¹⁸. However, most current research has treated unlicensed driving only as one factor affecting crash severity, and in-depth research on this group is lacking.

In addition, the advancement of machine learning (ML) algorithms in recent years has led to their increased adoption in traffic safety research¹⁹. ML is more frequently used as a prediction tool than traditional statistical methods. However, the two approaches share commonalities: both aim to improve forecasting accuracy by minimizing a loss function²⁰. ML methods are also more computationally demanding, relying heavily on computer science, which places them at the intersection of statistics and computer science²¹. Researchers favor tree-based models, whose branching structure is built by partitioning the feature space. These models are preferred because they do not require specific measurements, comprise multiple rules, can handle both numerical and categorical data, and offer high interpretability. Although statistical modeling and machine learning follow different methodological streams for prediction, the risk factors they identify are largely consistent²².

Study objective

The reviewed studies indicate that unlicensed PTW driving can result in severe crash outcomes, particularly in developing countries. However, limited attention has been given to understanding the specific impacts of unlicensed PTW drivers, especially in the context of crash severity. Key aspects such as driver status, employment status, and age distribution have been underexplored in relation to these drivers. This study, therefore, aims to fill this gap by deeply investigating the impact of unlicensed PTW drivers on two-vehicle crashes in China.

A key novel aspect of this research lies in the use of a customized Deep Forest (DF-ptw) model, which uniquely integrates multiple tree-based algorithms to capture complex patterns in accident data. This model enables a more nuanced understanding of how different factors and their interactions contribute to the severity of crashes involving unlicensed PTW drivers. By customizing the model specifically for PTW-related crashes, we provide new insights into the underlying causes of these crashes, contributing to the existing body of knowledge.

To analyze the effects of individual factors and their interactions on crash severity, the study employed SHAP (Shapley Additive Explanations) and PDP (Partial Dependence Plots) interpretation tools. SHAP quantifies the impact of individual variables on the model’s predictions, providing an understanding of how each feature influences the likelihood of severe injury outcomes. PDP allows us to visualize the marginal effect of one or more features on the predicted outcome, helping to identify key variables that may break the accident chain and reduce crash severity.

Data processing and description

The data for this study cover unlicensed PTW drivers involved in two-vehicle crashes in Shandong Province, China, from 2020 to 2022. After data processing, a total of 5777 unlicensed PTW drivers involved in two-vehicle crashes are obtained. The data are provided by the Center for Accident Research in Zibo (CARZ). The classification of PTWs in this study is based on the public safety industry standard promulgated by the People’s Republic of China in 2019: Road traffic management – Types of motor vehicles. According to this standard, PTWs are categorized into three subcategories; examples are shown in Fig. 1.

1. Motorcycle: motorcycles with a maximum design speed greater than 50 km/h
2. Auto-rickshaw: motorcycles equipped with two rear wheels symmetrically distributed with the front wheel

In two-vehicle crashes, the crash partners include minibuses, light trucks, medium trucks, large buses, and heavy trucks. These vehicles are common in the transportation system. Conservation of momentum in a crash places the smaller vehicle at a disadvantage when the crash partner is heavier. The difference in injury severity (DIS) between an unlicensed PTW driver and a crash partner driver (CPD) in a two-vehicle crash can be defined as

$$DIS={S_{NPTWD}} - {S_{CPD}}$$

(1)

where $S_{NPTWD}$ and $S_{CPD}$ represent the injury severity of the unlicensed PTW driver and the crash partner driver, respectively. Each takes an integer value from 0 to 3, indicating no injury, slight injury, serious injury, and fatality, respectively. $DIS$ indicates the degree of difference in injuries between the drivers of the two vehicles involved in the same crash. A positive value indicates that the unlicensed PTW driver is more severely injured than the crash partner driver, while a negative value indicates the opposite. A value of zero signifies that both drivers sustained the same level of injury. The statistics on $DIS$, based on the original data, are displayed in Table 1.
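As a minimal illustration of Eq. (1), the sketch below computes $DIS$ for a few crashes. The crash records and the function name `injury_difference` are hypothetical and are not taken from the CARZ data; only the 0 to 3 severity coding comes from the text.

```python
# Severity codes as described in the text: 0 = no injury, 1 = slight injury,
# 2 = serious injury, 3 = fatality.
SEVERITY_LABELS = {0: "no injury", 1: "slight injury", 2: "serious injury", 3: "fatality"}


def injury_difference(s_ptw: int, s_cpd: int) -> int:
    """Eq. (1): DIS = S_NPTWD - S_CPD for one two-vehicle crash."""
    for s in (s_ptw, s_cpd):
        if s not in SEVERITY_LABELS:
            raise ValueError(f"severity must be an integer in 0..3, got {s!r}")
    return s_ptw - s_cpd


# Hypothetical crashes as (PTW driver severity, crash partner driver severity).
crashes = [(1, 0), (3, 0), (0, 0), (1, 2)]
dis_values = [injury_difference(ptw, cpd) for ptw, cpd in crashes]
print(dis_values)  # [1, 3, 0, -1]
```

A positive entry means the PTW driver was hurt more severely than the crash partner driver, which is the case the paper reports as most common.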

Table 1 Difference in injury severity statistics between PTW drivers and other drivers.


Table 1 illustrates that the majority of unlicensed PTW drivers experienced an injury severity one level higher than that of the crash partner driver. When the crash partner is a minibus, 71.5% of cases result in $DIS=1$; the corresponding shares are 64.6% for light trucks, 53.6% for medium trucks, 56.8% for large buses, and 51.7% for heavy trucks. PTWs face significantly higher vulnerability within the transportation system²³. Driver injuries reflect the immediate consequences of crashes, providing a tangible measure of severity²⁴. Therefore, the injury severity of unlicensed PTW drivers is used to measure the severity of two-vehicle crashes. The original dataset categorized injuries into four severity levels: no injury (5.6%, 344 records), slight injury (66.6%, 3848 records), severe injury (5.8%, 335 records), and fatal injury (21.6%, 1250 records). A fatal injury is defined as the driver dying within seven days. We group no injury and slight injury together, labeled as NS injuries, to mitigate the influence of under-reporting on model performance. Severe injury and fatal injury are likewise combined and recorded as serious or fatal injury (SFI). The injury severity of unlicensed PTW drivers, taking the values NS and SFI, is defined as the response variable. The potential contributing factors are modeled as explanatory variables, listed in Table 2.
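The grouping of the four severity levels into the binary response can be sketched as follows. The record counts are the ones quoted in the paragraph above; the dictionary names are illustrative, and the class shares are recomputed from the counts, so they may differ slightly from the rounded percentages in the text.

```python
from collections import Counter

# Record counts per original severity level, as quoted in the text.
counts = {
    "no injury": 344,
    "slight injury": 3848,
    "severe injury": 335,
    "fatal injury": 1250,
}

# Mapping from the four original levels to the binary response (NS vs. SFI).
TO_BINARY = {
    "no injury": "NS",
    "slight injury": "NS",
    "severe injury": "SFI",
    "fatal injury": "SFI",
}

binary_counts = Counter()
for level, n in counts.items():
    binary_counts[TO_BINARY[level]] += n

total = sum(binary_counts.values())
shares = {label: n / total for label, n in binary_counts.items()}

print(dict(binary_counts), total)
print({label: round(share, 3) for label, share in shares.items()})
```

The resulting classes are imbalanced (roughly three NS records for every SFI record), which is one reason macro-averaged metrics are a sensible choice for evaluation.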

Table 2 Details of variables


Methods

This study predicts the response variable by constructing a machine learning classification model. The response variable, injury severity, is categorical, making it suitable for prediction through classification models. A classification model outputs discrete categories that are easily understood and interpreted²⁵. Additionally, the explanatory variables include both discrete and continuous factors. Classification models handle these mixed features well, mapping them into discrete severity categories²⁶. The technical route of this study is shown in Fig. 2.

Optimal feature subset selection

Due to the large number of candidate features, it is necessary to perform feature selection on the explanatory variables to remove redundant features²⁷. Feature selection helps in identifying the most significant factors affecting injury severity, aiding in the development of targeted interventions and policies to improve road safety.

This study uses the Boruta-Shap algorithm, which classifies features as accepted, unaccepted, or tentative. It is an extension of the Boruta feature selection method that uses SHAP values to measure feature importance²⁸, and it sets the acceptance threshold from the SHAP values of shadow features. The shadow features $X_i^{shadow}$ are created by duplicating the unlicensed PTW driver dataset $X$ and shuffling the values within each feature $X_i$. The original features $X$ are then combined with the shadow features $X^{shadow}$ to form the extended dataset $X^{extended}$, and the SHAP value of each feature is calculated on $X^{extended}$. The SHAP value measures the importance of each feature:

$$\phi_i=\sum\limits_{S \subseteq N\setminus\{i\}} \frac{|S|!\,(n-|S|-1)!}{n!}\left[v(S\cup\{i\})-v(S)\right]$$

(2)

where $\phi_i$ denotes the contribution of the $i$-th feature, $N$ is the set of all features, $n$ is the number of features in $N$, $S$ is a subset of $N$ that does not contain feature $i$, and $v(S)$ is the prediction obtained from the features in $S$. The SHAP value distribution of each feature is then compared with the maximum SHAP value distribution of the shadow features using the t-statistic:

$$t=\frac{\mu_j-\mu_{\max\_shadow}}{\sqrt{\frac{\sigma_j^{2}}{n}+\frac{\sigma_{\max\_shadow}^{2}}{n}}}$$

(3)

where $\mu_j$ and $\sigma_j$ are the mean and standard deviation of the SHAP values $\phi_j$ of feature $j$, $\mu_{\max\_shadow}$ and $\sigma_{\max\_shadow}$ are the mean and standard deviation of the maximum SHAP values among the shadow features, and $n$ in Eq. (3) is the number of values in each SHAP distribution (not the feature count of Eq. 2). Based on the p-value from the t-test, a feature is considered accepted if $p<0.05$; otherwise it is either unaccepted or tentative. In this study, accepted features and tentative features are used as the optimal feature subset.
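The two mechanical steps of this procedure, building shadow features and evaluating the t-statistic of Eq. (3), can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' implementation: the toy dataset and importance values are made up, the sample variance is assumed for $\sigma^2$, and the SHAP computation itself and the conversion of $t$ into a p-value are left out.

```python
import math
import random
import statistics


def make_shadow_features(rows, seed=0):
    """Shuffle every column independently so it keeps its marginal
    distribution but loses any relationship with the response."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*rows)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]


def t_statistic(feature_shap, shadow_shap):
    """Eq. (3) for two equally sized samples of importance values.
    Assumption: sigma^2 is the sample variance."""
    n = len(feature_shap)
    if n != len(shadow_shap) or n < 2:
        raise ValueError("both samples need the same size n >= 2")
    mu_j = statistics.mean(feature_shap)
    mu_s = statistics.mean(shadow_shap)
    var_j = statistics.variance(feature_shap)
    var_s = statistics.variance(shadow_shap)
    return (mu_j - mu_s) / math.sqrt(var_j / n + var_s / n)


# Toy dataset: three rows, two features (values are made up).
rows = [[1, 10], [2, 20], [3, 30]]
shadow = make_shadow_features(rows)
extended = [orig + shad for orig, shad in zip(rows, shadow)]

# Hypothetical importance values collected over four repeated trials.
feature_shap = [0.30, 0.32, 0.28, 0.31]
shadow_shap = [0.05, 0.07, 0.04, 0.06]
t = t_statistic(feature_shap, shadow_shap)
print(len(extended[0]), round(t, 2))
```

A large positive $t$ indicates that the real feature is consistently more important than the best shadow feature, which is the condition under which the feature would be accepted.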

Customized deep forest model

Deep Forest consists of two components: Multi-grained Scanning and the Cascade Forest²⁹. In this study, Random Forest and Light Gradient Boosting Machine (LightGBM) models are trained on the optimal feature subset and used as customized predictors. The Deep Forest structure customized in this study is shown in Fig. 3.

Multi-grained scanning

Multi-grained Scanning is designed to enhance feature representation by capturing patterns at different granularities²⁹. It helps extract more informative features, especially from the high-dimensional crash data used in this study. The Multi-grained Scanning procedure is shown in Fig. 3. The optimal feature subset, with $L$-dimensional features and a $C$-class response variable (NS and SFI), is divided into 3 sub-instances using sliding windows of different dimensions. Each sliding window of dimension $w_i$ moves with stride $s_i$ and extracts local regions of the input data. The number of feature vectors $n_{fv\_i}$ per sub-instance is then defined as

$${n_{fv\_i}}=\frac{{L - {w_i}}}{{{s_i}}}+1$$

(4)

Next, all $n_{fv\_i}$ feature vectors extracted with the same $w_i$-dimensional window are used to train a Completely-Random Forest and a Random Forest. Using both a Completely-Random Forest and a Random Forest increases the diversity of the base models, which reduces the risk of overfitting and improves the effectiveness of Multi-grained Scanning³⁰. The class vectors are then generated and concatenated as transformed features. The dimension of the transformed feature vector $D_{tf\_i}$ corresponding to the original $L$-dimensional input can be defined as

$${D_{tf\_i}}=2 * C * {n_{fv\_i}}$$

(5)

Finally, the $i$-th transformed feature vector is used to train the $j$-th level of the Cascade Forest, where $j$ and $i$ satisfy the following relationship:

$$j=i+k * M$$

(6)
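The window arithmetic of Eqs. (4) and (5) can be checked with a short sketch. The input length $L=10$ and $C=2$ classes are used here for illustration only; the window sizes and strides are hypothetical, since the text does not state the values the authors used, and all function names are illustrative.

```python
def sliding_windows(x, w, s):
    """Extract every local region of width w from x, moving with stride s."""
    return [x[start:start + w] for start in range(0, len(x) - w + 1, s)]


def n_feature_vectors(L, w, s):
    """Eq. (4): n_fv = (L - w) / s + 1."""
    if w > L or (L - w) % s != 0:
        raise ValueError("window and stride must tile the input exactly")
    return (L - w) // s + 1


def transformed_dim(L, w, s, C):
    """Eq. (5): D_tf = 2 * C * n_fv, the factor 2 being the two forests
    (Completely-Random Forest and Random Forest) trained per window."""
    return 2 * C * n_feature_vectors(L, w, s)


L, C = 10, 2                      # illustrative: 10 features, classes NS and SFI
grains = [(2, 1), (4, 1), (6, 1)]  # hypothetical (window width, stride) pairs

x = list(range(L))                # stand-in for one encoded crash record
for w, s in grains:
    windows = sliding_windows(x, w, s)
    assert len(windows) == n_feature_vectors(L, w, s)
    print(w, s, len(windows), transformed_dim(L, w, s, C))
```

With these made-up settings the three grains yield 9, 7, and 5 windows, and transformed feature vectors of dimension 36, 28, and 20.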

Cascade forest

The Cascade Forest is the core architecture of Deep Forest. It enables deep learning through hierarchical, layer-by-layer processing of features³¹, allowing the model to progressively refine its feature representation and predictions. The cascade layer of a traditional Cascade Forest consists of four base predictors: two Completely-Random Forests and two Random Forests. The final prediction is the average of the base predictors’ results³². In this study, the trained Random Forest and LightGBM are used as predictors to construct the cascade layers. Introducing two types of predictors enhances the model’s flexibility, robustness, and performance³³. LightGBM is then stacked as a meta-learner in the $(N+1)$-th layer to complete the Cascade Forest. The Cascade Forest structure is shown in Fig. 3.

The transformed $D_{tf\_1}$-dimensional features generated by Multi-grained Scanning are used as the input dataset for level 0 of the Cascade Forest. Each predictor estimates the class distribution by calculating the percentage of each class among the training examples of the $D_{tf\_1}$-dimensional features and then averaging these percentages across all trees in the predictor. The average estimate of the class distribution on the $D_{tf\_1}$-dimensional feature dataset for the Random Forest predictor can be defined as³⁴

$$H({D_{tf\_1}})=\frac{1}{K}\sum\limits_{{k=1}}^{K} {{\xi _k}{h_k}({D_{tf\_1}})}$$

(7)

where $k=1,2,\dots,K$ indexes the decision trees integrated into the Random Forest, $K$ is the number of trees, $h_k(D_{tf\_1})$ is the output of the $k$-th decision tree for the $D_{tf\_1}$-dimensional feature dataset, and $\xi_k$ is the weight of the $k$-th decision tree. For the LightGBM predictor, the class distribution on the $D_{tf\_1}$-dimensional feature dataset is estimated by trees fitted to minimize the following objective:

$$Obj=\left[\sum\limits_{i=1}^{N} L\left(y_i,\hat{y}_i^{(t-1)}+P\right)\right]+\frac{1}{2}\lambda P^{2}+\gamma T$$

(8)

where $L$ is the loss function, $N$ is the number of samples of the $D_{tf\_1}$-dimensional features, $y_i$ is the true label of the $i$-th sample, and $\hat{y}_i^{(t-1)}$ is the predicted value from the previous decision trees. $P$ is the predicted value of the $t$-th decision tree, and $T$ is the total number of nodes in the $t$-th tree. $\lambda$ and $\gamma$ are hyperparameters that penalize the magnitude of the predicted values and the number of nodes, respectively, to prevent overfitting. The estimated class distribution forms a $D_{cv\_1}$-dimensional class vector corresponding to the $D_{tf\_1}$-dimensional transformed feature vector.
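The averaging in Eq. (7) is simple enough to sketch directly. The per-tree class distributions below are invented for illustration, and the function name is hypothetical; with all weights $\xi_k$ equal to 1 the expression reduces to the plain mean over trees.

```python
def forest_class_distribution(tree_outputs, weights=None):
    """Eq. (7): H = (1/K) * sum_k xi_k * h_k, evaluated per class.

    tree_outputs: list of K class distributions, one per decision tree.
    weights:      optional list of K tree weights xi_k (defaults to 1.0 each).
    """
    K = len(tree_outputs)
    if weights is None:
        weights = [1.0] * K
    if len(weights) != K:
        raise ValueError("need exactly one weight per tree")
    n_classes = len(tree_outputs[0])
    return [
        sum(w * out[c] for w, out in zip(weights, tree_outputs)) / K
        for c in range(n_classes)
    ]


# Hypothetical outputs of three trees for one crash record: [P(NS), P(SFI)].
tree_outputs = [[0.8, 0.2], [0.6, 0.4], [0.7, 0.3]]
class_vector = forest_class_distribution(tree_outputs)
print([round(p, 3) for p in class_vector])  # [0.7, 0.3]
```

The resulting two-element list is the class vector that one predictor contributes to the next cascade level.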

Next, every level of the Cascade Forest except level 0 takes as input the transformed feature vector combined with the class vector generated by the previous level. The $D_{cv\_1}$-dimensional class vector produced by level 0 is concatenated with the $D_{tf\_1}$-dimensional feature vector and passed to level 1 of the Cascade Forest. Similarly, the $D_{cv}$-dimensional class vector produced by the $(j-1)$-th level is concatenated with the $D_{tf\_i}$-dimensional feature vector and passed to the $j$-th level, until validation performance converges³⁵.

$${D_{{\text{cv}}}}={N_{rf}} * C$$

(9)

Finally, as the layers continue to be stacked, the valid information in the features is continuously enhanced. When the final layer is reached, the $D_{cv}$-dimensional class vector is no longer combined with the $D_{tf\_i}$-dimensional feature vector, and the LightGBM meta-learner is applied. At each node split, the feature with the best Gini value among $\sqrt{D_{cv}+D_{tf\_i}}$ randomly selected features is chosen for splitting. The final class vector $D_{last}$ contains the class probabilities of the current sample, and the class with the highest probability, $\max(D_{last})$, is the Cascade Forest’s estimate.
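The data flow through the cascade (concatenating each level's class vector with the features, then handing the last class vector to a meta-learner) can be sketched as below. The three predictor functions are toy stand-ins, not Random Forest or LightGBM models, and all names and input values are hypothetical; the sketch only shows how the vectors are passed along and how the final class is chosen.

```python
CLASSES = ["NS", "SFI"]


def cascade_predict(features, levels, meta_learner):
    """Run one sample through the cascade.

    levels:       list of levels, each a list of predictor callables that
                  map an input vector to a class distribution of length C.
    meta_learner: callable applied to the last class vector only.
    """
    class_vector = []  # nothing to concatenate before level 0
    for predictors in levels:
        level_input = list(features) + class_vector
        class_vector = [p for predict in predictors for p in predict(level_input)]
    final = meta_learner(class_vector)
    label = max(range(len(final)), key=lambda c: final[c])
    return label, final, class_vector


def mean_score(x):
    """Toy stand-in for the Random Forest predictor."""
    p = sum(x) / len(x)
    return [1 - p, p]


def max_score(x):
    """Toy stand-in for the LightGBM predictor."""
    p = max(x)
    return [1 - p, p]


def average_meta(class_vector):
    """Toy stand-in for the meta-learner: average the P(SFI) entries."""
    sfi = class_vector[1::2]
    p = sum(sfi) / len(sfi)
    return [1 - p, p]


features = [0.2, 0.9, 0.4]  # made-up transformed features in [0, 1]
levels = [[mean_score, max_score], [mean_score, max_score]]
label, final, class_vector = cascade_predict(features, levels, average_meta)
print(CLASSES[label], [round(p, 3) for p in final], len(class_vector))
```

With two predictors per level and $C=2$ classes, each level emits a class vector of length 4, matching $D_{cv}=N_{rf} * C$ in Eq. (9) if $N_{rf}$ is read as the number of predictors per level.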

Performance metrics

To assess the predictive performance of DF-ptw, this study uses four evaluation metrics: $Accuracy$, $Precision_{macro}$, $Recall_{macro}$, and $F_1\text{-}score_{macro}$. The response variable in this study has two categories, NS and SFI. For an overall assessment of a classifier, we use the macro averages of $Precision$, $Recall$, and $F_1\text{-}score$.

$Accuracy$ measures the proportion of correctly classified cases among all cases in the dataset and is calculated from Eq. 10. $Precision_{macro}$ computes the precision for each class separately and then averages those values (Eq. 11). $Recall_{macro}$ averages the per-class recall, where the recall of a class is the fraction of its instances that the model classifies correctly (Eq. 12). $F_1\text{-}score_{macro}$ is the harmonic mean of $Precision_{macro}$ and $Recall_{macro}$ (Eq. 13).

$$\text{Accuracy}=\frac{\text{Correct predictions}}{\text{All predictions}}$$

(10)

$$\text{Precision}_{macro}=\frac{1}{2}\left(\frac{TP_{NS}}{TP_{NS}+FP_{NS}}+\frac{TP_{SFI}}{TP_{SFI}+FP_{SFI}}\right)$$

(11)

$$\text{Recall}_{macro}=\frac{1}{2}\left(\frac{TP_{NS}}{TP_{NS}+FN_{NS}}+\frac{TP_{SFI}}{TP_{SFI}+FN_{SFI}}\right)$$

(12)

$$F_1\text{-score}_{macro}=\frac{2\cdot\text{Precision}_{macro}\cdot\text{Recall}_{macro}}{\text{Precision}_{macro}+\text{Recall}_{macro}}$$

(13)
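The four metrics can be computed from a confusion matrix as in the sketch below. The matrix values are hypothetical and are not the results reported in Fig. 5. Note that this follows the definition in the text, where the macro $F_1$-score is the harmonic mean of macro precision and macro recall; some libraries instead average the per-class $F_1$ values, which generally gives a different number.

```python
def macro_metrics(cm):
    """Accuracy, macro precision, macro recall, and macro F1 (Eqs. 10-13).

    cm[i][j] is the number of samples with true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[c][c] for c in range(n)) / total

    precisions, recalls = [], []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(cm[c]) - tp                       # true c, predicted otherwise
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)

    precision = sum(precisions) / n
    recall = sum(recalls) / n
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


# Hypothetical confusion matrix; rows and columns are ordered [NS, SFI].
cm = [[80, 20],
      [10, 40]]
accuracy, precision, recall, f1 = macro_metrics(cm)
print(round(accuracy, 4), round(precision, 4), round(recall, 4), round(f1, 4))
```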

Interpretability of model results

In this study, SHAP and PDP are used to explain the effect of factors on the predicted results of the model. The goal of SHAP is to explain the prediction of an instance $\:{x}_{i}$ by computing the contribution of each feature to the prediction. The core idea of using the SHAP summary plot in this study is to utilize Shapley values, combining feature importance with feature effects, to show the relationship between the value of a feature and its impact on the prediction. The Shapley value of a feature value is its contribution to the payout, weighted and summed over all possible feature value combinations:

$$\phi_j(val)=\sum\limits_{S\subseteq\{1,\cdots,p\}\setminus\{j\}}\frac{|S|!\,(p-|S|-1)!}{p!}\left(val_x(S\cup\{j\})-val_x(S)\right)$$

(14)

where $S$ is a subset of the features used in the model that does not contain feature $j$, $x$ is the vector of feature values of the instance to be explained, $p$ is the number of features, and $val_x(S)$ is the prediction for the feature values in set $S$.
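For a handful of features, Eq. (14) can be evaluated exactly by enumerating all subsets, as in the sketch below. The value function is a made-up toy with three features and one interaction, not the DF-ptw model; practical SHAP implementations use approximations or tree-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(p, value):
    """Exact Shapley values for p features.

    value: callable mapping a frozenset of feature indices S to val_x(S).
    """
    phis = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        phi = 0.0
        for size in range(p):  # |S| ranges over 0 .. p-1
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi += weight * (value(s | {j}) - value(s))
        phis.append(phi)
    return phis


def toy_value(s):
    """Hypothetical val_x(S): baseline 0.2, additive effects for features
    0 and 1, plus an interaction between features 0 and 2."""
    v = 0.2
    if 0 in s:
        v += 0.5
    if 1 in s:
        v += 0.1
    if 0 in s and 2 in s:
        v += 0.2
    return v


phis = shapley_values(3, toy_value)
print([round(phi, 3) for phi in phis])  # [0.6, 0.1, 0.1]
```

The interaction term of 0.2 is split evenly between features 0 and 2, and the values sum to the difference between the full prediction and the baseline, which is the efficiency property that makes SHAP values additive.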

The partial dependence plot (PDP) shows the marginal effect one or two features have on the predicted outcome³⁶. The partial dependence function is defined as:

$$\hat{f}_S(x_S)=E_{X_C}\left[\hat{f}(x_S,X_C)\right]=\int \hat{f}(x_S,X_C)\,d\mathbb{P}(X_C)$$

(15)

Here $x_S$ represents the features for which the partial dependence function is plotted, while $X_C$ denotes the other features used in the machine learning model $\hat{f}$, which are treated as random variables. The features in $S$ are those whose effect on the prediction is of interest. The feature vectors $x_S$ and $X_C$ together constitute the total feature space $X$. Partial dependence functions marginalize the machine learning model output over the distribution of the features in set $C$, thus illustrating the relationship between the features in set $S$ and the predicted outcome.
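In practice the integral in Eq. (15) is usually estimated by averaging the model output over the observed data while the feature of interest is held at a fixed value. The sketch below shows this Monte Carlo estimate for a single feature; the toy model, its coefficients, and the four records are invented for illustration and do not reproduce the paper's results.

```python
def partial_dependence(model, rows, feature_index, grid):
    """Estimate the partial dependence of `model` on one feature.

    For each grid value, the feature is overwritten in every row and the
    model outputs are averaged, which marginalizes over the other features.
    """
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value
            total += model(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(x):
    """Hypothetical P(SFI) as a function of [age, night-time flag]."""
    age, night = x
    return 0.2 + (0.1 if age > 50 else 0.0) + 0.05 * night


# Made-up records: [driver age, 1 if the crash happened at night else 0].
rows = [[25, 0], [40, 1], [60, 0], [70, 1]]
pd_age = partial_dependence(toy_model, rows, feature_index=0, grid=[30, 60])
print([round(v, 3) for v in pd_age])  # [0.225, 0.325]
```

The difference between the two grid points (0.1 in this toy case) is the kind of quantity the discussion below reports as a change in the probability of SFI relative to a reference category.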

Results and discussion

Model evaluation

The features of the unlicensed PTW driver dataset are filtered using Boruta-Shap, resulting in an optimal subset of 10 accepted features: PTW type, unlicensed PTW driver age, unlicensed PTW driver employment status, driving behavior of unlicensed PTW driver, CP type, road functional class, physical separation of the road, crash time, visibility, and lighting condition. As shown in Fig. 4, the green box plots correspond to the accepted features.

These 10 features are used in DF-ptw as the optimal subset for predicting injury severity. For comparison, this study also uses the two single predictors, Random Forest and LightGBM, for injury severity prediction. The two-dimensional confusion matrices for the three models are shown in Fig. 5.

Model prediction accuracy metrics can be derived from the confusion matrices, as shown in Fig. 6. DF-ptw outperforms the single predictors, Random Forest and LightGBM, on the optimal feature subset: its $Accuracy$, $Precision_{macro}$, $Recall_{macro}$, and $F_1\text{-}score_{macro}$ are all higher than those of Random Forest and LightGBM.

Analysis of factors affecting

Since serious and fatal injuries involve the greatest loss of life and property, this study aimed to analyze the significant factors contributing to the SFI. This study categorizes these factors into individual-level and group-level categories. Individual-level factors refer to attributes specific to individual vehicles and drivers, not shared with others (e.g., PTW-related characteristics and CPD-related characteristics). Group-level factors encompass attributes shared by all drivers and vehicles (e.g., Roadway characteristics, Crash characteristics, and Environmental characteristics).

Features within the subset differ in how strongly they influence the prediction results. Figure 7 illustrates the feature importance ranking for the SFI class within the optimal subset. The key individual-level factors that significantly influence the probability of the unlicensed PTW driver suffering SFI in a two-vehicle crash, ranked in order of importance, are: the driving behavior of the unlicensed PTW driver, the CP type, the unlicensed PTW driver age, the unlicensed PTW driver employment status, and the PTW type. The group-level factors are ranked by importance as follows: the road functional class, the visibility, the crash time, the physical separation of the road, and the lighting condition.

The SHAP summary plot gives an initial indication of the relationship between the value of a feature and its impact on the prediction. However, to discern the precise nature of this relationship, partial dependence plots must be examined. Figures 8 and 9 show the PDPs for individual-level factors and group-level factors on the SFI class, respectively. The y-axis represents the change in the predicted value due to the change in the factor, with the blue band indicating the confidence interval.

Individual-level factors

(1) Driving behavior of unlicensed PTW driver.

The driving behavior of the unlicensed PTW driver has a significant effect on SFI, as shown in Fig. 7. Risky driving behaviors raise the likelihood of involvement in a serious crash³⁷. As shown in Fig. 8(a), compared with unlicensed PTW drivers with no abnormal driving behavior, driving under the influence of alcohol increases the probability that drivers suffer SFI in crashes by about 0.5. Fatigued, furious, and distracted driving also increase the likelihood that an unlicensed PTW driver suffers SFI, raising the probability by approximately 0.3, 0.08, and 0.05, respectively, compared with no abnormal driving behavior. Given the significant impact of alcohol consumption on crash severity, we recommend enhancing the scrutiny of PTW driving qualifications and imposing stricter penalties on unlicensed PTW riders found driving under the influence of alcohol³⁸. This approach aims to reduce alcohol-related crash risks and improve road safety.

(2) CP type.

As shown in Fig. 7, the SHAP value gradually increases as the crash partner becomes larger and heavier. Due to the substantial disparities between PTWs and large vehicles, unlicensed PTW drivers are more likely to sustain serious injuries when the crash partner is a large vehicle³⁹. Compared with crashes with a minibus, crashes involving a light truck, medium truck, bus, or heavy truck increase the probability of SFI by approximately 0.03, 0.06, 0.09, and 0.125, respectively, as shown in Fig. 8(b). It is important to note that unlicensed PTW drivers are often seriously injured in crashes involving heavy trucks. Heavy trucks have large blind spots, making it difficult for their drivers to notice other vehicles at intersections or when turning or reversing, which can easily lead to serious crashes⁴⁰. Additionally, the weight of these vehicles results in significant inertia, making them difficult to control during sudden emergencies⁴¹. This leads to longer braking distances and an increased likelihood of crashes. PTW riders are more vulnerable in collisions with larger vehicles, emphasizing the need for enhanced safety measures. To mitigate these risks, it is recommended that traffic control authorities promote the installation of blind spot detection systems in heavy vehicles and encourage drivers to slow down or stop at intersections or turns to better observe their surroundings.

(3) Unlicensed PTW driver age.

The SHAP value increases gradually with the age of the unlicensed PTW driver, as shown in Fig. 7. Unlicensed PTW drivers aged 20 to 45 tend to have a lower probability of SFI in two-vehicle crashes than those aged 18 to 20, possibly due to greater driving experience and emergency response skills. Older age, in contrast, has a positive effect on SFI. This is consistent with previous studies, which show that older individuals are physically weaker and more likely to be seriously injured in a crash⁴². Changes in predicted values become evident for drivers older than about 53 years, as shown in Fig. 8(c), and the probability of SFI for elderly drivers is approximately 0.125 higher than for those aged 18 to 20. It is recommended that traffic control authorities intensify checks on the driving qualifications of elderly PTW riders to reduce the number of unlicensed elderly drivers. Additionally, road safety education should be provided to raise safety awareness among elderly drivers.

(4) Unlicensed PTW driver employment status.

The employment status of unlicensed PTW drivers also affects the injury severity they sustain in two-vehicle crashes, as shown in Fig. 7. Compared with farmers, being a blue-collar, white-collar, or self-employed unlicensed PTW driver increases the probability of SFI by approximately 0.03, 0.05, and 0.08, respectively, as shown in Fig. 8(d). Variations in driving style among unlicensed PTW drivers with different employment statuses may explain these differences. It is worth noting that some self-employed individuals may perceive obtaining a driver’s license as requiring significant time and financial investment and may not consider it a necessary expense. A study from Sweden found that unlicensed drivers from self-employed families had a higher risk estimate for severe injury than has been reported in other studies⁴³. Since self-employed unlicensed drivers may be at higher risk because they potentially lack formal training, it is crucial to design education programs that focus on safe driving practices, including the dangers of driving while fatigued or distracted.

(5) PTW type.

When PTWs crash with four-wheeled vehicles, the type of PTW has a limited but still discernible impact on crash severity. The auto-rickshaw has a SHAP value below 0, indicating a negative effect on SFI, as shown in Fig. 7. Compared with unlicensed auto-rickshaw drivers, unlicensed motorcycle drivers have a probability of SFI that is approximately 0.02 higher, as shown in Fig. 8(e). The higher risk of severe injury among unlicensed motorcycle riders is primarily attributed to the lack of safety equipment, such as helmets and seat belts, which increases vulnerability in crashes. It is recommended to enforce policies requiring all PTW riders to use safety gear to reduce injury severity. Additionally, promoting the importance of protective equipment through media campaigns, community activities, and driving schools can raise awareness and emphasize the critical role that helmets and other safety gear play in rider safety.

Group-level factors

(1) Road functional class.

Traffic volumes, speed limits, and safety facilities vary across road functional classes, leading to differences in crash severity⁴⁴. According to Fig. 7, road functional class, as a group-level factor, has a significant effect on SFI. National and provincial roads have higher speed limits and traffic volumes than urban and rural roads, making serious crashes more likely⁴⁵. Crashes on national and provincial roads increase the probability of SFI by approximately 0.06 compared with urban roads, as shown in Fig. 9(a). The large speed difference between PTWs and four-wheelers on national and provincial roads can lead to serious crash outcomes⁴⁶. Owing to poorer infrastructure and road conditions, two-vehicle crashes involving an unlicensed PTW driver on a rural road increase the probability of SFI by approximately 0.03. Given the increased severity of crashes on national and provincial roads, it is essential to implement stricter speed limits, improved traffic controls, and better road infrastructure maintenance on these roads.

(2) Visibility.

Visibility has consistently been a crucial factor in traffic safety. Figure 7 illustrates that low visibility significantly impacts the likelihood of SFI. Compared with visibility greater than 200 m, visibility below 50 m increases the probability of SFI by approximately 0.05, as shown in Fig. 9(b). Studies have shown that low visibility reduces sight distance and greatly increases the risk of traffic crashes, especially on high-traffic roadways prone to rear-end crashes⁴⁷. To address the significant impact of low visibility on crash severity, local authorities should invest in enhanced lighting systems, particularly on roads with limited visibility, such as rural roads or high-traffic areas prone to rear-end crashes.

(3) Crash time.

Factors such as traffic volume, visibility, lighting conditions, and driving behavior change over the course of the day, so their effects on crash severity vary across time periods. The study found that the probability of SFI increased by approximately 0.04 around midnight compared with 7 a.m., as shown in Fig. 9(c). Late-night hours are often associated with a higher incidence of impaired driving due to alcohol consumption or fatigue, both of which are major contributors to severe crashes. In addition, drivers tend to show more risky and aggressive behaviors at night because of reduced traffic volume⁴⁸. Because the probability of serious injury increases around midnight, targeted interventions such as increased patrols or temporary traffic restrictions during high-risk hours could be effective in reducing the likelihood of crashes during these times.

(4) Physical separation of the road.

The form of road segregation is crucial for defining user right-of-way and absorbing crash impact energy⁴⁹. As roadway separation improves, from no separation, to type A (central separation only) or type B (separation of motor-vehicle and non-motor-vehicle lanes only), to both types A and B, the SHAP value decreases to below zero, indicating that enhanced separation has a negative effect on SFI, as shown in Fig. 7. In other words, better separation reduces the likelihood of severe crash outcomes⁵⁰. Compared with no separation, having both type A and type B separation reduces the probability of SFI by approximately 0.04, as shown in Fig. 9(d). This finding supports the idea that better physical separation of different types of road users significantly mitigates crash severity.
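The sign convention used here (a negative SHAP value means the feature pushes the predicted SFI probability down) follows from the definition of Shapley values. The sketch below computes exact Shapley values for a two-feature toy model by enumerating all feature subsets. It assumes a single reference row in place of the background distribution that SHAP implementations normally use, and the model `toy_sfi_probability`, its coefficients, and the feature names are hypothetical rather than drawn from DF-ptw.

```python
from itertools import combinations
from math import factorial


def toy_sfi_probability(row):
    """Made-up model: separation lowers risk, night raises it, with an interaction."""
    sep, night = row["separated"], row["night"]
    return 0.30 - 0.04 * sep + 0.06 * night - 0.02 * sep * night


def shapley_values(f, x, reference):
    """Exact Shapley values of f at x, measured against a single reference row."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Features in `subset` take the instance value, the rest the reference value.
        mixed = {k: (x[k] if k in subset else reference[k]) for k in names}
        return f(mixed)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                members = set(coalition)
                total += weight * (value(members | {i}) - value(members))
        phi[i] = total
    return phi


x = {"separated": 1, "night": 1}          # crash on a separated road at night
reference = {"separated": 0, "night": 0}  # hypothetical reference crash

phi = shapley_values(toy_sfi_probability, x, reference)
print(phi)
# The contributions sum to the difference between the two predictions.
print(sum(phi.values()), toy_sfi_probability(x) - toy_sfi_probability(reference))
```

In this toy case the contribution of `separated` is negative and that of `night` is positive, which matches how the signs in a SHAP summary plot are interpreted.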

(5) Light condition.

Lighting conditions have a similar effect on crash severity as visibility. The likelihood of SFI increases at dawn, at dusk, and during unlit nighttime conditions, as shown in Fig. 9(e). Dazzle can occur at dusk or dawn, while the headlights of oncoming traffic interfere with a driver’s vision at night, especially when there are no streetlights⁵¹. Moreover, because of the reduced visible range, drivers need longer reaction times and more space to decelerate when they encounter dangerous situations. Enhancing road lighting, particularly in high-risk areas such as intersections or rural roads, would reduce crash risk during nighttime driving.

Conclusion

This study explores the significant factors influencing the injury severity of Chinese unlicensed PTW drivers in two-vehicle crashes. To this end, we constructed a customized Deep Forest Model (DF-ptw) to classify and predict high-dimensional crash data. The Multi-granularity Scanning structure in DF-ptw captures both local and global patterns in the data and enhances the feature representation of the model. Additionally, incorporating multiple types of predictors (Random Forest and LightGBM) within the Cascade Forest structure of DF-ptw enhances the model’s robustness and performance. The prediction results show that the customized DF-ptw outperforms the single predictors, Random Forest and LightGBM, in predictive performance.
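The data flow of these two structures can be sketched as follows, based on the general deep forest design rather than on the authors’ DF-ptw implementation. Multi-granularity scanning slides windows of several sizes over the feature vector, and a cascade level appends each predictor’s class-probability vector to its input. The two predictor functions are hypothetical stand-ins for fitted Random Forest and LightGBM models, and the window sizes and feature values are arbitrary.

```python
def sliding_windows(features, size, stride=1):
    """Return every contiguous sub-vector of length `size`."""
    return [features[i:i + size] for i in range(0, len(features) - size + 1, stride)]


def multi_grained_scan(features, sizes):
    """Scan the same feature vector at several window sizes (granularities)."""
    return {size: sliding_windows(features, size) for size in sizes}


def cascade_level(features, predictors):
    """Append each predictor's class-probability vector to the input features."""
    augmented = list(features)
    for predict_proba in predictors:
        augmented.extend(predict_proba(features))
    return augmented


# Hypothetical stand-ins for two fitted models; each returns
# [P(not SFI), P(SFI)] for one crash record.
def forest_like(features):
    p = min(0.9, 0.1 + 0.05 * sum(features))
    return [1 - p, p]


def boosting_like(features):
    p = min(0.9, 0.2 + 0.1 * max(features))
    return [1 - p, p]


record = [1, 0, 2, 1, 0, 3]  # arbitrary encoded crash features

windows = multi_grained_scan(record, sizes=[2, 4])
level_1 = cascade_level(record, [forest_like, boosting_like])
level_2 = cascade_level(level_1, [forest_like, boosting_like])

print({size: len(w) for size, w in windows.items()})
print(len(record), len(level_1), len(level_2))
```

Small windows expose local feature combinations and large windows expose broader ones, while mixing two different predictor types at each cascade level is what gives the ensemble its diversity.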

Applying SHAP and PDPbox to the meta-learner of DF-ptw reveals the key factors influencing two-vehicle crashes involving unlicensed PTW drivers. Unlicensed drivers who are under the influence of alcohol, unlicensed motorcycle drivers, self-employed unlicensed drivers, and those older than 53 years have a significantly higher risk of sustaining serious injuries in two-vehicle crashes. Additionally, crashes occurring on national and provincial roads, on non-separated roadways, and during late-night hours further increase the likelihood of serious injury or fatality for unlicensed PTW drivers.

In response to these findings, several targeted recommendations are proposed. First, checks on PTW driving qualifications should be strengthened to reduce the number of unlicensed drivers. Media campaigns, community events, and driving schools should be leveraged to promote the importance of obtaining a driver’s license and wearing protective gear, with a focus on older drivers and self-employed individuals. Additionally, increasing police presence on the roads and imposing stricter penalties on unlicensed PTW drivers is recommended. Traffic authorities should also advocate for the installation of blind spot detection systems on heavy vehicles and encourage drivers to slow down or stop at intersections or turns to prevent crashes with PTWs. Improving road lighting in high-risk areas, especially at night, is also essential to boost visibility and decrease the likelihood of collisions. These recommendations aim to reduce the number of unlicensed PTW drivers and mitigate the severity of crashes, ultimately enhancing overall road safety.

The current study still has some limitations. The crash data were collected from one province in eastern China, so geographic differences may limit the applicability of the conclusions to northern and western regions, and the findings may not be directly generalizable to other regions with different traffic conditions, regulations, or demographic characteristics. In future studies, crash severity analysis will focus on mountainous and rural areas in western China, incorporating vehicle data with points of interest (POI) and more detailed data to explore the impact of other relevant factors on crash severity.

Data availability

The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.

References

WHO. Powered two-and three-wheeler safety: a road safety manual for decision-makers and practitioners, 2nd ed. (2022).
Wang, X., Peng, Y., Yi, S., Wang, H. & Yu, W. Risky behaviors, psychological failures and kinematics in vehicle-to-powered two-wheeler accidents: results from in-depth Chinese crash data. Accid. Anal. Prev. 156, 106150. https://doi.org/10.1016/j.aap.2021.106150 (2021).
Puthan, P., Lubbe, N., Shaikh, J., Sui, B. & Davidsson, J. Defining crash configurations for powered Two-Wheelers: comparing ISO 13232 to recent in-depth crash data from Germany, India and China. Accid. Anal. Prev. 151, 105957. https://doi.org/10.1016/j.aap.2020.105957 (2021).
Wang, C., Xu, C., Xia, J. & Qian, Z. The effects of safety knowledge and psychological factors on self-reported risky driving behaviors including group violations for e-bike riders in China. Transp. Res. part. F: Traffic Psychol. Behav. 56, 344–353. https://doi.org/10.1016/j.trf.2018.05.004 (2018).
Talbot, R., Brown, L. & Morris, A. Why are powered two wheeler riders still fatally injured in road junction crashes?–A causation analysis. J. Saf. Res. 75, 196–204. https://doi.org/10.1016/j.jsr.2020.09.009 (2020).
Li, J., Fang, S., Guo, J., Fu, T. & Qiu, M. A motorcyclist-injury severity analysis: a comparison of single-, two-, and multi-vehicle crashes using latent class ordered probit model. Accid. Anal. Prev. 151, 105953. https://doi.org/10.1016/j.aap.2020.105953 (2021).
Brown, L. et al. Investigation of accidents involving powered two wheelers and bicycles–A European in-depth study. J. Saf. Res. 76, 135–145. https://doi.org/10.1016/j.jsr.2020.12.015 (2021).
Ijaz, M., Zahid, M. & Jamal, A. A comparative study of machine learning classifiers for injury severity prediction of crashes involving three-wheeled motorized rickshaw. Accid. Anal. Prev. 154, 106094. https://doi.org/10.1016/j.aap.2021.106094 (2021).
Schneider, W. H. IV, Savolainen, P. T., Van Boxel, D. & Beverley, R. Examination of factors determining fault in two-vehicle motorcycle crashes. Accid. Anal. Prev. 45, 669–676. https://doi.org/10.1016/j.aap.2011.09.037 (2012).
Chiou, Y. C., Hwang, C. C., Chang, C. C. & Fu, C. Reprint of modeling two-vehicle crash severity by a bivariate generalized ordered probit approach. Accid. Anal. Prev. 61, 97–106. https://doi.org/10.1016/j.aap.2013.07.005 (2013).
Zou, R. et al. Analyzing driver injury severity in two-vehicle rear-end crashes considering leading-following configurations based on passenger car and light truck involvement. Accid. Anal. Prev. 193, 107298. https://doi.org/10.1016/j.aap.2023.107298 (2023).
Haque, M. M., Chin, H. C. & Debnath, A. K. An investigation on multi-vehicle motorcycle crashes using log-linear models. Saf. Sci. 50 (2), 352–362. https://doi.org/10.1016/j.ssci.2011.09.015 (2012).
Chiou, Y. C., Fu, C. & Ke, C. Y. Modelling two-vehicle crash severity by generalized estimating equations. Accid. Anal. Prev. 148, 105841. https://doi.org/10.1016/j.aap.2020.105841 (2020).
Tamakloe, R., Das, S., Aidoo, E. N. & Park, D. Factors affecting motorcycle crash casualty severity at signalized and non-signalized intersections in Ghana: insights from a data mining and binary logit regression approach. Accid. Anal. Prev. 165, 106517. https://doi.org/10.1016/j.aap.2021.106517 (2022).
Goel, R. Modelling of road traffic fatalities in India. Accid. Anal. Prev. 112, 105–115. https://doi.org/10.1016/j.aap.2017.12.019 (2018).
Priye, S. & Manoj, M. Passengers’ perceptions of safety in paratransit in the context of three-wheeled electric rickshaws in urban India. Saf. Sci. 124, 104591. https://doi.org/10.1016/j.ssci.2019.104591 (2020).
Hanna, C. L., Laflamme, L. & Bingham, C. R. Fatal crash involvement of unlicensed young drivers: county level differences according to material deprivation and urbanicity in the United States. Accid. Anal. Prev. 45, 291–295. https://doi.org/10.1016/j.aap.2011.07.014 (2012).
Martín-delosReyes, L. M., Martínez-Ruiz, V., Rivera-Izquierdo, M., Jiménez-Mejías, E. & Lardelli-Claret, P. Is driving without a valid license associated with an increased risk of causing a road crash? Accid. Anal. Prev. 149, 105872. https://doi.org/10.1016/j.aap.2020.105872 (2021).
Santos, K., Dias, J. P. & Amado, C. A literature review of machine learning algorithms for crash injury severity prediction. J. Saf. Res. 80, 254–269. https://doi.org/10.1016/j.jsr.2021.12.007 (2022).
Ali, Y., Hussain, F. & Haque, M. M. Advances, challenges, and future research needs in machine learning-based crash prediction models: a systematic review. Accid. Anal. Prev. 194, 107378. https://doi.org/10.1016/j.aap.2023.107378 (2024).
Makridakis, S., Spiliotis, E. & Assimakopoulos, V. Statistical and machine learning forecasting methods: concerns and ways forward. PLoS One 13 (3), e0194889. https://doi.org/10.1371/journal.pone.0194889 (2018).
Mansoor, U., Jamal, A., Su, J., Sze, N. N. & Chen, A. Investigating the risk factors of motorcycle crash injury severity in Pakistan: insights and policy recommendations. Transp. Policy. 139, 21–38. https://doi.org/10.1016/j.tranpol.2023.05.013 (2023).
Huang, H., Siddiqui, C. & Abdel-Aty, M. Indexing crash worthiness and crash aggressivity by vehicle type. Accid. Anal. Prev. 43 (4), 1364–1370. https://doi.org/10.1016/j.aap.2011.02.010 (2011).
Cai, Z. & Wei, F. Modelling injury severity in single-vehicle crashes using full bayesian random parameters multinomial approach. Accid. Anal. Prev. 183, 106983. https://doi.org/10.1016/j.aap.2023.106983 (2023).
Salas, P., De la Fuente, R., Astroza, S. & Carrasco, J. A. A systematic comparative evaluation of machine learning classifiers and discrete choice models for travel mode choice in the presence of response heterogeneity. Expert Syst. Appl. 193, 116253. https://doi.org/10.1016/j.eswa.2021.116253 (2022).
Zhao, X., Yan, X., Yu, A. & Van Hentenryck, P. Prediction and behavioral analysis of travel mode choice: a comparison of machine learning and logit models. Travel Behav. Soc. 20, 22–35. https://doi.org/10.1016/j.tbs.2020.02.003 (2020).
Chandrashekar, G. & Sahin, F. A survey on feature selection methods. Comput. Electr. Eng. 40 (1), 16–28. https://doi.org/10.1016/j.compeleceng.2013.11.024 (2014).
Kursa, M. B. & Rudnicki, W. R. Feature selection with the Boruta package. J. Stat. Softw. 36, 1–13. https://doi.org/10.18637/jss.v036.i11 (2010).
Hu, G., Li, H., Xia, Y. & Luo, L. A deep Boltzmann machine and multi-grained scanning forest ensemble collaborative method and its application to industrial fault diagnosis. Comput. Ind. 100, 287–296. https://doi.org/10.1016/j.compind.2018.04.002 (2018).
Xia, S., Wang, G., Chen, Z. & Duan, Y. Complete random forest based class noise filtering learning for improving the generalizability of classifiers. IEEE Trans. Knowl. Data Eng. 31 (11), 2063–2078. https://doi.org/10.1109/TKDE.2018.2873791 (2018).
Liu, X., Wang, R., Cai, Z., Cai, Y. & Yin, X. Deep multigrained cascade forest for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 57 (10), 8169–8183. https://doi.org/10.1109/TGRS.2019.2918587 (2019).
Sesmero, M. P., Iglesias, J. A., Magán, E., Ledezma, A. & Sanchis, A. Impact of the learners diversity and combination method on the generation of heterogeneous classifier ensembles. Appl. Soft Comput. 111, 107689. https://doi.org/10.1016/j.asoc.2021.107689 (2021).
Tang, J., Liang, J., Han, C., Li, Z. & Huang, H. Crash injury severity analysis using a two-layer stacking framework. Accid. Anal. Prev. 122, 226–238. https://doi.org/10.1016/j.aap.2018.10.016 (2019).
Breiman, L. Random forests. Mach. Learn. 45, 5–32. https://doi.org/10.1023/A:1010933404324 (2001).
Liu, X. et al. Deep forest based intelligent fault diagnosis of hydraulic turbine. J. Mech. Sci. Technol. 33, 2049–2058. https://doi.org/10.1007/s12206-019-0408-9 (2019).
Friedman, J. H. Greedy function approximation: a gradient boosting machine. Annals Stat. 29 (5), 1189–1232 (2001). http://www.jstor.org/stable/2699986
Maistros, A., Schneider, W. H. IV & Savolainen, P. T. A comparison of contributing factors between alcohol related single vehicle motorcycle and car crashes. J. Saf. Res. 49, 129–e1. https://doi.org/10.1016/j.jsr.2014.03.002 (2014).
Mohamad, I. Gender disparities in rural motorcycle accidents: a neural network analysis of travel behavior impact. Accid. Anal. Prev. 210, 107840. https://doi.org/10.1016/j.aap.2024.107840 (2025).
Hyun, K. K., Jeong, K., Tok, A. & Ritchie, S. G. Assessing crash risk considering vehicle interactions with trucks using point detector data. Accid. Anal. Prev. 130, 75–83. https://doi.org/10.1016/j.aap.2018.03.002 (2019).
Jansen, R. J. & Varotto, S. F. Caught in the blind spot of a truck: a choice model on driver glance behavior towards cyclists at intersections. Accid. Anal. Prev. 174, 106759. https://doi.org/10.1016/j.aap.2022.106759 (2022).
Wei, F., Xu, P., Guo, Y. & Wang, Z. Qualitatively and quantitatively explore injury severity of light motor vehicle drivers involved in heavy goods vehicle crashes. Transp. Lett. 16 (10), 1353–1365. https://doi.org/10.1080/19427867.2024.2306009 (2024).
Chen, S. J., Chen, C. Y. & Lin, M. R. Risk factors for crash involvement in older motorcycle riders. Accid. Anal. Prev. 111, 109–114. https://doi.org/10.1016/j.aap.2017.11.006 (2018).
Hanna, C. L., Hasselberg, M., Laflamme, L. & Möller, J. Road traffic crash circumstances and consequences among young unlicensed drivers: a Swedish cohort study on socioeconomic disparities. BMC Public. Health. 10, 1–8. https://doi.org/10.1186/1471-2458-10-14 (2010).
Anderson, J. & Hernandez, S. Roadway classifications and the accident injury severities of heavy-vehicle drivers. Analytic Methods Accid. Res. 15, 17–28. https://doi.org/10.1016/j.amar.2017.04.002 (2017).
Celik, A. K. & Oktay, E. A multinomial logit analysis of risk factors influencing road traffic injury severities in the Erzurum and Kars Provinces of Turkey. Accid. Anal. Prev. 72, 66–77. https://doi.org/10.1016/j.aap.2014.06.010 (2014).
Guo, Y., Li, Z., Liu, P. & Wu, Y. Modeling correlation and heterogeneity in crash rates by collision types using full bayesian random parameters multivariate Tobit model. Accid. Anal. Prev. 128, 164–174. https://doi.org/10.1016/j.aap.2019.04.013 (2019).
Peng, Y., Abdel-Aty, M., Shi, Q. & Yu, R. Assessing the impact of reduced visibility on traffic crash risk using microscopic data and surrogate safety measures. Transp. Res. part. C: Emerg. Technol. 74, 295–305. https://doi.org/10.1016/j.trc.2016.11.022 (2017).
Hossain, M. M., Zhou, H. & Das, S. Data mining approach to explore emergency vehicle crash patterns: a comparative study of crash severity in emergency and non-emergency response modes. Accid. Anal. Prev. 191, 107217. https://doi.org/10.1016/j.aap.2023.107217 (2023).
Hossain, A., Sun, X., Shahrier, M., Islam, S. & Alam, S. Exploring nighttime pedestrian crash patterns at intersection and segments: findings from the machine learning algorithm. J. Saf. Res. 87, 382–394. https://doi.org/10.1016/j.jsr.2023.08.010 (2023).
Das, S., Avelar, R., Dixon, K. & Sun, X. Investigation on the wrong way driving crash patterns using multiple correspondence analysis. Accid. Anal. Prev. 111, 43–55. https://doi.org/10.1016/j.aap.2017.11.016 (2018).
Brijs, T., Karlis, D. & Wets, G. Studying the effect of weather conditions on daily crash counts using a discrete time-series model. Accid. Anal. Prev. 40 (3), 1180–1190. https://doi.org/10.1016/j.aap.2008.01.001 (2008).

Funding

This research was jointly supported by (1) the Natural Science Foundation of Shandong Province (No. ZR2024MG014); (2) the Shandong Provincial Programme of Introducing and Cultivating Talents of Discipline to Universities: Research and Innovation Team of Intelligent Connected Vehicle Technology (Grant No. 2021SLG08); (3) the Project of the Open Fund of State Key Lab of Intelligent Transportation System (Grant No. 2024-B009); and (4) the SDUT & Zibo City Integration Development Project (Grant No. 2022JS005).

Author information

Peixiang Xu and Fulu Wei contributed equally.

Authors and Affiliations

School of Transportation and Vehicle Engineering, Shandong University of Technology, Zibo, 255000, China
Peixiang Xu, Fulu Wei, Dong Guo, Yongqing Guo, Lizu Sun & Chuan Liu
State Key Lab of Intelligent Transportation System, Beijing, 100000, China
Bin Zhou

Contributions

P.X.: conceptualization, methodology, validation, writing – review & editing; F.W.: software, investigation, validation, formal analysis, visualization, writing – original draft; D.G.: supervision, methodology, project administration, funding; Y.G.: formal analysis, resources, funding; L.S.: visualization; C.L.: validation, resources; B.Z.: validation, resources. All authors reviewed the manuscript.

Corresponding author

Correspondence to Fulu Wei.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Xu, P., Wei, F., Guo, D. et al. Exploring the injury severity of unlicensed powered two- and three-wheeler drivers in two-vehicle crashes in China. Sci Rep 15, 11802 (2025). https://doi.org/10.1038/s41598-025-88896-3

Received: 18 December 2024
Accepted: 31 January 2025
Published: 06 April 2025
DOI: https://doi.org/10.1038/s41598-025-88896-3