Introduction

With rapid urbanization, urban roads are continuously expanding, and intersections within the road network are becoming increasingly dense, resulting in many intersections with short connecting segments, known as short-distance intersections1. Due to the limited space for queuing vehicles in these short segments and the influence of traffic signals from both upstream and downstream intersections, traffic spillover often occurs—where the queue length of vehicles at a downstream intersection exceeds the segment length and spills over into the upstream intersection2. Severe traffic spillover can lead to gridlock across the entire road network, making the identification and control of traffic spillover a focal point in current traffic management research.

Over the past decade, numerous methods have been developed for identifying the traffic spillover state. The most fundamental approach involves installing detectors such as loops, radars, and video cameras at the exits of upstream intersections3,4, and determining whether queue length spillover occurs at downstream intersections by monitoring vehicle occupancy rates5. Although these methods can identify traffic spillover, their effectiveness is limited by the detection range of the detectors. Therefore, some studies6,7,8,9 have established spillover indices or algorithmic models based on collected traffic data such as flow, queue length, speed, and density to identify the traffic spillover state in real time, providing insights for the traffic spillover identification research in this paper. Additionally, these traffic spillover identification methods lay the foundation for traffic spillover control. Most current studies implement traffic spillover control measures immediately after identifying spillover to reduce vehicle queue length and address intersection spillover issues9. While these measures improve queue spillover to some extent, they also restrict the normal flow of merging vehicles, increase intersection delays, and overlook the possibility of traffic spillover dissipating on its own. This is particularly relevant in short-distance segments between intersections, typically ranging from 50 to 200 m, where traffic spillover occurs frequently10. If the traffic signal is adjusted immediately each time spillover occurs, the frequent changes in signal plans may hinder vehicle flow at short-distance intersections, significantly impacting their traffic capacity. Therefore, for short-distance intersections, if traffic spillover can be identified in advance through predictive methods, and further predictions are made to determine whether the spillover will dissipate in time, signal control strategies can be selectively applied to address the issue. Consequently, we focus on short-distance intersections and propose to predict the dissipation of traffic spillover on the basis of its identification.

Additionally, the above studies reveal that vehicle queue length is crucial for identifying and dissipating traffic spillover. Currently, many researchers have conducted studies on queue length estimation and prediction. Some studies directly estimate and predict queue length11,12,13. However, due to the high dynamics and uncertainty of traffic systems, factors such as traffic flow, vehicle speed, and signal timing are constantly changing, resulting in queue length lacking obvious patterns14. This makes it difficult to directly and accurately predict future queue length. To address this, indirect estimation and prediction methods have emerged15,16,17,18, which indirectly predict queue length by analyzing or forecasting parameters such as traffic flow15, cycle time16,17, and offset18. These methods offer advantages such as easy data acquisition, controllable accuracy, obvious periodic variation characteristics, and good queue length prediction performance. Therefore, machine learning models can be further employed to predict queue length by integrating the above-mentioned data.

At present, machine learning models commonly used for traffic data prediction include Decision Tree (DT), Convolutional Neural Network (CNN), Random Forest (RF), Long Short-Term Memory Network (LSTM), and Gated Recurrent Unit (GRU). Compared with DT, CNN and RF models, LSTM and GRU are more suitable for processing and predicting important events with relatively long intervals and delays in time series, demonstrating higher prediction accuracy in some traffic application scenarios19,20. However, since LSTM and GRU can only process sequential data in a forward direction and cannot capture reverse contextual information, while the current state of traffic data is related to both past and future states, models that can handle long-term dependencies in time series and capture contextual information in both forward and backward directions are preferred. For example, the Bidirectional Long Short-Term Memory Network (Bi-LSTM) model, through the mechanisms of forget gate, input gate, and output gate, can more finely control the flow and storage of information in time series, making it more accurate and effective in capturing and utilizing long-term dependent information. Additionally, its feature of capturing contextual information in both directions allows the model to consider both past and future contexts simultaneously21, making it potentially more suitable for traffic data prediction. Therefore, this paper proposes using the Bi-LSTM model for prediction.

Based on the above analysis, the formation and dissipation of traffic spillover are closely related to vehicle queue length, which in turn is influenced by intersection flow, cycle time, and offset schemes. Therefore, the main approach of this study on traffic spillover identification and dissipation prediction for short-distance intersections is as follows: First, using intersection flow, cycle time, offset, and other relevant traffic data, the Bi-LSTM model is applied to predict the queue length of stranded vehicles. On this basis, further research is conducted on traffic spillover identification and dissipation prediction by integrating the predicted queue length of stranded vehicles with other relevant traffic data. Specifically, we address the problem of traffic spillover identification and dissipation prediction for short-distance intersections through the following three contributions. First, drawing on traffic wave theory, we propose criteria for identifying and distinguishing traffic spillover and dissipation at short-distance intersections, laying the foundation for predicting traffic spillover dissipation at such intersections. Second, we construct the input data set required for the machine learning prediction model by utilizing methods for calculating the queue length of stranded vehicles on the segment and the traffic spillover dissipation state, thereby improving the model’s interpretability. Third, based on the Bi-LSTM model, we develop a two-stage prediction model, where the queue length of stranded vehicles predicted in the first stage serves as feature data for the second stage, enhancing the prediction accuracy of traffic spillover dissipation.

The remainder of this paper is organized as follows. The literature review is presented in Sect. "Literature review". The research methodology is introduced in Sect. "Method". The case study results are presented in Sect. "Results". The findings and the reasons behind these results are discussed in Sect. "Discussion". The conclusions are given in Sect. "Conclusions".

Literature review

In terms of traffic spillover state identification, Geroliminis et al.22 proposed a method to identify spillover through changes in occupancy rate detected by loop detectors, and provided an occupancy threshold for spillover occurrence. Shi Xiaowei et al.3 used video detectors to obtain real-time queue length on road segments to identify the traffic spillover state. Ren et al.4 proposed a method to identify vehicle queue spillover based on actual speed measurements from detectors. Wu et al.23 used high-resolution traffic signal data to measure the spillover index at oversaturated intersections in either the temporal or spatial dimension. Zhang Lidong et al.6 proposed a fuzzy recognition algorithm for traffic spillover that takes vehicle queue ratio and average segment speed as input data, grounded in the theory of intelligent fuzzy inference. Ma et al.7 proposed a method to determine the spillover threshold occupancy using traffic flow data collected from upstream fixed detectors. Zhu Renwei et al.8, based on detector data and traffic wave theory, proposed a vehicle spillover identification model for bottleneck sections by comparing the maximum generalized queue length with the spillover safety distance. Zhang et al.9, using advanced wide-area radar detection data, established a comprehensive spillover identification index with average speed and space occupancy as sub-indicators. Zhang et al.24 used unmanned aerial vehicle remote sensing for preliminary identification of intersection spillover. The above studies show that most research can identify the traffic spillover state in real time through detectors or by establishing spillover indicators and algorithm models, but there is relatively little research on predicting the traffic spillover state.

In terms of traffic spillover control, Zhang et al.9 immediately applied a matched control strategy based on the traffic demand in the non-spillover phase of the signalized intersection after detecting traffic spillover through wide-area radar, in order to control the queue length of the traffic spillover. Cesme et al.25 proposed a self-organizing control logic for oversaturated intersections, where, if the spillover detector identifies spillover on the main road, it immediately reduces the green light time for the side road and extends that for the main road to reduce the spillover queue length. Zhang Lidong et al.26 optimized the signal control plan through an intelligent traffic spillover coordination control algorithm immediately after detecting traffic spillover, allowing the queue of spillover vehicles to gradually dissipate. Yao et al.27, upon detecting traffic spillover, immediately used fuzzy control to reallocate green light time for the spillover direction. By implementing minimal, shortened, or no green light, they restricted merging vehicles and promptly controlled the spillover queue length. The above methods control traffic immediately after spillover occurs at intersections by adjusting traffic signals to restrict vehicle merging or facilitate vehicle dispersal. This reduces queue length, allowing spillover vehicles to dissipate, indicating that queue length has a significant impact on the dissipation of traffic spillover. The above studies indicate that most research adopts traffic spillover control measures immediately after identifying spillover to reduce vehicle queue lengths and address intersection spillover issues. However, these measures restrict the normal flow of merging vehicles, leading to increased intersection delays, and overlook the possibility of traffic spillover dissipating spontaneously.

In terms of queue length estimation and prediction, Lee et al.11 developed a real-time lane-based queue length prediction method using CNN and LSTM algorithms. Rahman R et al.12 used an LSTM to predict queue length for the next cycle. Comert et al.13 applied traffic wave theory and the Robertson dispersion model to predict traffic queue length in real time. However, due to the high dynamics and uncertainty of traffic systems, it is difficult to directly and accurately predict future queue length. Therefore, some indirect estimation and prediction methods have emerged. Dai et al.15 suggested that queue length increases continuously with a sharp rise in traffic flow and developed a queue length prediction model based on deterministic queuing theory, building on traffic flow forecasting. Liu et al.16, based on traffic wave theory and high-resolution traffic signal data, proposed a method to predict the current cycle’s queue length by performing regression analysis on the previous cycle’s queue length. Yao et al.18, considering the offset between upstream and downstream intersections, proposed a periodic vehicle queue length estimation method based on fixed detector data. Zhao et al.28 proposed a probabilistic approach to establish and solve a single-variable formula for detecting vehicle penetration rates, which can be used to estimate penetration rates and predict traffic flow and queue length. Abewickrema et al.17 proposed a novel multivariate time-varying Kalman filter approach to estimate the cycle-based maximum queue lengths in real time using only high-resolution vehicle loop detector data and signal timing data. The above studies indicate that queue length can be predicted through direct or indirect methods, but indirect queue length prediction methods are superior, offering advantages such as easy data acquisition, controllable accuracy, obvious periodic variation characteristics, and good queue length prediction performance.

In terms of traffic data prediction based on machine learning models, Xia et al.29 proposed a traffic flow forecasting method based on gradient boosting DT. Fu et al.30 proposed a traffic data prediction method that integrates Transformer and CNN based on various nonlinear spatial relationships of correlated intersections and the potential temporal dependencies in traffic volume. Wumaier et al.31 introduced a dynamic traffic data prediction method based on the RF algorithm. Katambire et al.19 used LSTM and the Auto Regressive Integrated Moving Average model to predict future traffic rates. The results showed that LSTM is the best-fitting model for monthly traffic data prediction. However, compared with DT, CNN, RF, LSTM and GRU models, the Bi-LSTM model can process both forward and backward sequence information, which can not only utilize the traffic state information at past moments, but also consider the impact of future moments on the current state, and capture more comprehensive time-series features. Therefore, some scholars have carried out research on traffic data prediction based on the Bi-LSTM model. For example, Ma et al.32 proposed a short-term traffic flow prediction model based on time series analysis of traffic flow and Bi-LSTM to improve prediction accuracy and deeply explore the spatiotemporal characteristics of traffic flow data. Ounoughi C et al.33 introduced a Bi-LSTM-based model for traffic speed prediction. Abduljabbar R L et al.21 used simulation data to develop a Bi-LSTM short-term traffic prediction model, and the results showed that Bi-LSTM performed exceptionally well across multiple prediction horizons for all traffic variables. The above studies indicate that traffic data can be predicted through various machine learning models, but Bi-LSTM with the feature of capturing contextual information in both forward and backward directions has better prediction performance.

In summary, most research on traffic spillover focuses on real-time identification through detectors or spillover indices, with limited studies on traffic spillover prediction. Additionally, most traffic spillover control measures are implemented after spillover occurs, overlooking the possibility of spontaneous dissipation. Meanwhile, queue length is crucial for identifying and dissipating traffic spillover, and future queue lengths can be better predicted indirectly. On this basis, the Bi-LSTM model can be used to predict traffic spillover and its dissipation at short-distance intersections, providing a predictive method for the advanced management and control of traffic spillover at such intersections.

Method

Traffic spillover identification and dissipation criteria

Consider the short-distance intersections shown in Fig. 1, where the upstream and downstream intersections are denoted as O1 and O2, respectively, and the internal distance is ln. The segment between O1 and O2 in the westbound direction is the segment where traffic spillover dissipation analysis is conducted (hereafter referred to as the “segment”), with a length of Ln. The signal cycle time at both O1 and O2 is C, with the green light durations for westbound traffic being g1 and g2, respectively. The offset (i.e., the difference between the green light start times at O1 and O2, with the green light start time at O2 as the reference) is tf.

Fig. 1 Analysis of traffic spillover dissipation between short-distance intersections.

The queue length of stranded vehicles on the segment during the previous signal cycle at O2 is L. When the green light at O1 starts during the current cycle, vehicles depart at a speed of u1 toward the segment. When they encounter queuing vehicles on the segment, they stop and form a stopping wave with a wave speed of utw. During the current signal cycle at O2, the queuing vehicles on the segment begin to leave at a speed of u2, forming a starting wave with a wave speed of uqw. When the stopping wave and the starting wave meet at point A, the vehicle queue length on the segment reaches its maximum, Lmax. According to traffic wave theory34, formulas (1) to (3) hold at this point.

$$u_{tw} (t_{qs} + t_{f} - t_{1} ) = u_{qw} (t_{qs} - \frac{L}{{u_{qw} }})$$
(1)
$$t_{qs} = \frac{{L + (t_{f} - t_{1} )u_{tw} }}{{u_{qw} - u_{tw} }}$$
(2)
$$t_{1} = \frac{{L_{n} - L + l_{n} }}{{u_{1} }}$$
(3)

where tqs is the dissipation time of queuing vehicles on the road segment, and t1 is the time taken for the first vehicle entering from O1 after the green light at O1 starts to reach the end of the queuing vehicles on the road segment in Fig. 1. The maximum vehicle queue length Lmax for the road segment can be calculated using formula (4).

$$L_{{{\text{max}}}} = t_{qs} \cdot u_{qw} = \frac{{L + (t_{f} - t_{1} )u_{tw} }}{{u_{qw} - u_{tw} }} \cdot u_{qw}$$
(4)

Substituting formula (3) into formula (4) and subtracting Ln gives the difference \(\Delta L\) between Lmax and Ln, as shown in formula (5).

$$\Delta L = L_{{{\text{max}}}} - L_{n} = \frac{{L \cdot u_{1} u_{qw} + (L - l_{n} + t_{f} u_{1} )u_{tw} u_{qw} - L_{n} (u_{tw} u_{qw} + u_{1} (u_{qw} - u_{tw} ))}}{{u_{1} (u_{qw} - u_{tw} )}}$$
(5)
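The calculation in formulas (2) to (4), followed by the subtraction of Ln, can be sketched numerically as below. This is a minimal illustration rather than the implementation used in this paper; the function name and all numerical values (lengths in m, speeds in m/s, times in s) are hypothetical.

```python
def spillover_margin(L, L_n, l_n, u_1, u_tw, u_qw, t_f):
    """Return (t_1, t_qs, L_max, delta_L) following formulas (2) to (4)."""
    if u_qw <= u_tw:
        raise ValueError("the starting wave must be faster than the stopping wave")
    t_1 = (L_n - L + l_n) / u_1                      # formula (3)
    t_qs = (L + (t_f - t_1) * u_tw) / (u_qw - u_tw)  # formula (2)
    L_max = t_qs * u_qw                              # formula (4)
    return t_1, t_qs, L_max, L_max - L_n             # last item is delta_L


# Hypothetical inputs, chosen only to illustrate the arithmetic.
t_1, t_qs, L_max, delta_L = spillover_margin(
    L=60.0, L_n=150.0, l_n=30.0, u_1=10.0, u_tw=2.0, u_qw=5.0, t_f=15.0
)
print("spillover expected" if delta_L >= 0 else "no spillover expected")
```

With these assumed inputs the maximum queue length stays below the segment length, so the sign of the margin indicates no spillover in the current cycle.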

From formula (5), it can be seen that when \(\Delta L < 0\), traffic spillover will not occur in the current cycle; conversely, when \(\Delta L \ge 0\), traffic spillover will occur (as shown in Fig. 1), and its occurrence is closely related to the maximum queue length. Given that uqw > utw, if L increases and Ln decreases, \(\Delta L\) is more likely to exceed 0, thus increasing the likelihood of traffic spillover. When traffic spillover occurs, whether it will dissipate can be determined by checking if the last vehicle spilling before the red light at O1 can reach the safe distance Ls within the road section. This distance ensures that vehicles merging from other directions (taking a left turn north and a right turn south) during the red light time r1 for O1 in Fig. 1 will not spill over into O1. The relationship is expressed as follows.

$$L_{s} + \left( {Q_{Nl} \cdot w_{1} + Q_{Sr} \cdot w_{2} } \right)H_{d} \le L_{n}$$
(6)

where QNl and QSr are the number of vehicles taking a left turn north and a right turn south toward O2 during the r1 at O1 in Fig. 1, respectively. w1 and w2 represent the proportion of vehicles in QNl and QSr that enter the road section, and Hd is the average headway between vehicles.
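As a hedged illustration of formula (6), the check and its rearrangement for the largest admissible Ls can be written as follows. The function names and the example values are assumptions made only for this sketch.

```python
def safe_distance_holds(L_s, Q_Nl, Q_Sr, w_1, w_2, H_d, L_n):
    """True when formula (6) holds: vehicles merging during r1 still fit on the segment."""
    merging_queue = (Q_Nl * w_1 + Q_Sr * w_2) * H_d
    return L_s + merging_queue <= L_n


def max_safe_distance(Q_Nl, Q_Sr, w_1, w_2, H_d, L_n):
    """Largest L_s satisfying formula (6), obtained by rearranging the inequality."""
    return L_n - (Q_Nl * w_1 + Q_Sr * w_2) * H_d


# Hypothetical case: 6 left-turning and 4 right-turning vehicles during r1,
# half of each entering the segment, 7 m headway, 150 m segment.
args = dict(Q_Nl=6, Q_Sr=4, w_1=0.5, w_2=0.5, H_d=7.0, L_n=150.0)
print(max_safe_distance(**args))
print(safe_distance_holds(L_s=80.0, **args), safe_distance_holds(L_s=120.0, **args))
```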

When the last vehicle involved in the traffic spillover before the red light at O1 in Fig. 1 reaches point B within the safe distance Ls, and no spillover occurs during the red light time r1 at O1, it indicates that the spillover can dissipate in time within the green light at O2. In this case, spillover control is not required; otherwise, spillover control is needed. To detect both traffic spillover and its dissipation, detectors 1 and 2 are placed at the safe distance and at the end of the road, respectively, as shown in Fig. 2. The vehicle occupancy states for the green and red lights at the i-th cycle at upstream intersection O1 in Fig. 1 are recorded as Sg1i, Sg2i, Sr1i, and Sr2i, respectively. These states are used to determine the spillover dissipation state Pi of the short-distance intersections during the i-th cycle.

Fig. 2 Layout of detectors and data collection points.

The values of Sg1i, Sg2i, Sr1i, and Sr2i are either 0 or 1. A value of 1 is assigned when a vehicle’s occupancy time at the detector reaches a certain threshold (e.g., 10 s), indicating that vehicles are queued up to the location of the detector; otherwise, the value is 0. Based on this, the determination and optimization strategy for the traffic spillover dissipation state Pi is shown in Table 1.

Table 1 Determination and optimization strategy for traffic spillover dissipation state.

In Table 1, the values of Pi are categorized into four scenarios: When Pi = 1, no traffic spillover occurs, so spillover control is not required. When Pi = 2, effectively dissipating traffic spillover occurs. This means that spillover happens only during the green light at O1 in Fig. 1 and dissipates before the green light ends. The queue length of vehicles remaining on the road is within the safe distance before the red light at O1 begins, so spillover control is not required.

When Pi = 3, it indicates the occurrence of potential non-dissipating traffic spillover, which means that traffic spillover happens during the green light period of O1 in Fig. 1. Before the red light starts, the queue length of stagnant vehicles from the spillover reaches or exceeds the safe distance. Although this queue length has not yet exceeded the road segment length Ln, there remains a potential spillover risk. Therefore, spillover control measures need to be implemented, and the signal timing plan can be appropriately adjusted. When Pi = 4, absolutely non-dissipating traffic spillover occurs. This means that spillover occurs during both the green and red times at O1 in Fig. 1 and cannot dissipate in time, requiring spillover control and a readjustment of the signal timing plan.

Through the setting of the above traffic spillover identification and dissipation discrimination conditions, it is convenient to determine whether spillover occurs between short-distance intersections and whether the spillover can dissipate, providing a basis for traffic spillover dissipation prediction.
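The four scenarios described above can be sketched in code. The decision rules of Table 1 are authoritative; the mapping below is only an assumed reading of the prose description (detector 1 at the safe distance, detector 2 at the end of the road), and the function names are hypothetical. Combinations that the prose does not describe are rejected rather than guessed.

```python
def occupancy_state(occupancy_time_s, threshold_s=10.0):
    """1 when the occupancy time at a detector reaches the threshold, else 0."""
    return 1 if occupancy_time_s >= threshold_s else 0


def dissipation_state(S_g1, S_g2, S_r1, S_r2):
    """ASSUMED mapping from detector states to P_i; Table 1 holds the actual rules.

    S_g1 is accepted for completeness but not used in this simplified reading.
    """
    if S_g2 == 0 and S_r2 == 0:
        return 1  # no queue at the end of the road: no spillover
    if S_g2 == 1 and S_r2 == 1:
        return 4  # spillover during both green and red at O1
    if S_g2 == 1 and S_r1 == 1:
        return 3  # spillover in green, queue still at the safe distance in red
    if S_g2 == 1:
        return 2  # spillover in green only, queue back within the safe distance
    raise ValueError("combination not covered by the prose description")


print(dissipation_state(S_g1=1, S_g2=1, S_r1=0, S_r2=0))
```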

Data collection approach and methods

Currently, commonly used spillover control methods mainly address the traffic spillover issue by adjusting three conditions: input flow rate15, signal cycle16, and offset18. These factors significantly impact the dissipation of traffic spillover, and when they change, traffic parameters such as the number of vehicles arriving at and leaving the road, average speed, and average density will also vary. Therefore, in this study, a simulation road network with short-distance intersections was built in VISSIM 11. Different data collection schemes corresponding to varying input flow rates Q, cycle time C, and offset tf were set to obtain the required data.

As shown in Fig. 3, Q is set with 10 schemes, each simulating the peak variation of traffic flow during peak hours. For each scheme, 12 different flow values Qi are selected, with one value chosen every 6 cycles in sequence. C is set with 9 values: 80s, 90s, 100s, 110s, 120s, 130s, 140s, 150s, and 160s. tf is set with 9 values: -20s, -15s, -10s, -5s, 0s, 5s, 10s, 15s, and 20s. Using the above method, a total of 810 (10 × 9 × 9) data collection schemes can be formed, as shown in Fig. 3. Since the flow values Qi change every 6 cycles, each data collection scheme will generate data for 72 (12 × 6) cycles. In total, 58,320 (810 × 72) cycles of data can be collected.
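The scheme counts above can be reproduced with a short enumeration. This is a sketch only: the flow schemes are represented by their indices 1 to 10 because the flow profiles themselves are given in the figure, and the variable names are hypothetical.

```python
from itertools import product

flow_schemes = range(1, 11)       # 10 flow schemes, identified by index only
cycle_times = range(80, 161, 10)  # 80 s ... 160 s, 9 values
offsets = range(-20, 21, 5)       # -20 s ... 20 s, 9 values

schemes = list(product(flow_schemes, cycle_times, offsets))
cycles_per_scheme = 12 * 6        # 12 flow values, each held for 6 cycles
total_cycles = len(schemes) * cycles_per_scheme

print(len(schemes), cycles_per_scheme, total_cycles)  # 810 72 58320
```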

Fig. 3 Data collection scheme setup.

To collect the required data, data collection points are first set at the end of the road segment and in front of the stop line (as shown in Fig. 2), which are used to collect the number of vehicles arriving at the road segment Q1i, the number of vehicles departing Q2i, the average speed vi, and the average density ki for each cycle. Based on the number of vehicles arriving and leaving collected each cycle, the number of vehicles stranded on the road Ri and the queue length of the stranded vehicles Li can be calculated. The calculation formulas are as follows:

$$R_{i} = Q_{1i} - Q_{2i} ,i = 1$$
(7)
$$R_{i} = R_{i - 1} + Q_{1i} - Q_{2i} ,i \ge 2$$
(8)
$$L_{i} = \frac{{R_{i} }}{{n_{L} }} \cdot H_{d}$$
(9)

where Ri and Ri-1 are the number of vehicles stranded on the road in the i-th and i-1-th cycles, respectively; Q1i and Q2i are the number of vehicles arriving at and leaving the road in the i-th cycle, respectively; Li is the queue length of the stranded vehicles on the road in the i-th cycle; Hd is the average headway between vehicles, typically taken as 7 m; and nL is the number of lanes on the road.
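Formulas (7) to (9) amount to a running sum of net arrivals followed by a conversion to length. A minimal sketch is given below; the function name and the per-cycle counts are hypothetical, and the 7 m headway is the typical value stated above.

```python
from itertools import accumulate


def stranded_queue(arrivals, departures, n_lanes, headway_m=7.0):
    """Return (R, L): stranded vehicles and their queue length per cycle."""
    net = [q1 - q2 for q1, q2 in zip(arrivals, departures)]
    # accumulate gives R_1 = Q_11 - Q_21 and R_i = R_{i-1} + Q_1i - Q_2i
    stranded = list(accumulate(net))
    lengths = [r / n_lanes * headway_m for r in stranded]  # formula (9)
    return stranded, lengths


# Hypothetical counts for three consecutive cycles on a two-lane segment.
R, L = stranded_queue(arrivals=[30, 34, 28], departures=[26, 30, 31], n_lanes=2)
print(R)  # [4, 8, 5]
print(L)  # [14.0, 28.0, 17.5]
```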

Based on formulas (7) and (8), it can be seen that if the number of vehicles stranded on the road in the first cycle, R1, is unknown, it is not possible to calculate the stranded vehicles for subsequent cycles, meaning the queue length of stranded vehicles for any future cycle cannot be directly computed. Therefore, it is necessary to collect Q1i and Q2i for each cycle and calculate Ri and Li according to formulas (7) to (9). The data set including Q1i, Q2i, Ri, Li, vi, and ki will be incorporated into the machine learning model as input data, which can be used to predict the queue length of stranded vehicles for a future cycle.

The occupancy states Sg1i, Sg2i, Sr1i, and Sr2i for each cycle must be collected through detector 1 and detector 2 in Fig. 2. Specifically, this can be achieved by developing a COM interface program for VISSIM 11, which will allow the calculation of the traffic spillover dissipation state Pi for each cycle using Table 1. These data will also be included in the input data set for the machine learning model, allowing it to further predict the traffic spillover dissipation state for a future cycle. Therefore, the final data to be collected in different data collection schemes, as shown in Table 2, will effectively enhance the interpretability of the prediction model.

Table 2 Data to be collected for different data collection schemes.

Through the setting of the above data collection ideas and methods, the data required in this paper and the reasons for collecting these data are intuitively displayed, which effectively improves the interpretability of the Bi-LSTM model and provides a data foundation for the subsequent construction of the prediction model.

Sensitivity analysis

Since Q, C and tf all have significant impacts on the traffic spillover dissipation state P of the road segment, this paper uses the local sensitivity analysis method36 to explore whether there is a causal relationship between the above factors and the traffic spillover dissipation state of the road segment, providing a basis for selecting input data for the traffic spillover dissipation prediction model. Drawing on the theory of this method, this paper takes Q, C, and tf as features and P as the target variable to analyze the influence of each feature on the target variable. The specific methods are as follows.

Taking the three factors Q, C, and tf as features, and using their current values Q0, C0 and tf0 as benchmark values respectively, the variation ranges of the features are set according to the benchmark values, as shown in formulas (10) to (12).

$$Q_{i} \in \left[ {Q_{0} - \Delta Q,Q_{0} + \Delta Q} \right]$$
(10)
$$C_{i} \in \left[ {C_{0} - \Delta C,C_{0} + \Delta C} \right]$$
(11)
$$t_{fi} \in \left[ {t_{f0} - \Delta t_{f} ,t_{f0} + \Delta t_{f} } \right]$$
(12)

where Qi, Ci, and tfi are the values of Q, C, and tf in the i-th cycle, and \(\Delta\) Q, \(\Delta\) C, and \(\Delta\) tf are the variations in Q, C, and tf, respectively.

Under the premise that other features remain at their baseline values, a specific feature takes different values within its range of variation, and the corresponding traffic spillover dissipation state Pi is obtained. For example, when C and tf are set to their baseline values C0 and tf0, and Q takes different values Qi within its range, the data Sg1i, Sg2i, Sr1i, and Sr2i are output by the VISSIM 11, from which the corresponding values of Pi can be derived. This allows for an analysis of the impact of each factor on the traffic spillover dissipation state.

Therefore, referring to the traffic operation conditions of actual short-distance intersections, this paper first sets the current values of the three features Q, C, and tf to 1300 veh/h, 100s, and -5s, respectively. When Q is analyzed as a single factor, C and tf are fixed at their current values. The value of Q is set according to Scheme 3 in Fig. 3, rising from 900 veh/h to 1400 veh/h and then decreasing to 800 veh/h, with a change of 100 veh/h every 600 s (6 cycles). The simulation runs for 7200 s. Similarly, when C is analyzed as a single factor, Q and tf are fixed at their current values, and C is set to range from 80s to 160s, with each cycle time value run 30 times. When tf is analyzed as a single factor, Q and C are fixed at their current values, and tf is set to range from -20s to 20s, with an increment of 5s every 3600 s (36 cycles). The simulation runs for 32,400 s. Then, the required data, similar to the data in Table 3, are collected and calculated from VISSIM 11 for each cycle, enabling the analysis of the impact of each feature on P.

Table 3 Partial collected data.

Through the above sensitivity analysis setup, it is convenient to determine whether the three features Q, C, and tf affect traffic spillover dissipation and the magnitude of their impacts, providing ideas for the design of data collection schemes.
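The one-factor-at-a-time design described in this section can be sketched as below, assuming a baseline of 1300 veh/h, a 100 s cycle, and a -5 s offset, and using the distinct levels of each feature. The function name is hypothetical, and the simulation call that would turn each setting into Pi is deliberately left out.

```python
def one_at_a_time(baseline, ranges):
    """List (feature, settings) pairs in which only `feature` leaves its baseline."""
    runs = []
    for name, values in ranges.items():
        for value in values:
            settings = dict(baseline)  # copy so the baseline stays unchanged
            settings[name] = value
            runs.append((name, settings))
    return runs


baseline = {"Q": 1300, "C": 100, "t_f": -5}  # veh/h, s, s
ranges = {
    "Q": range(800, 1401, 100),  # distinct flow levels visited by the flow profile
    "C": range(80, 161, 10),
    "t_f": range(-20, 21, 5),
}
runs = one_at_a_time(baseline, ranges)
print(len(runs))  # 25
```

Each settings dictionary would then be passed to the simulation, and the resulting detector states would be converted to Pi for comparison across levels.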

Two-stage prediction model of traffic spillover dissipation based on Bi-LSTM

Bi-LSTM model theory

Bi-LSTM has strong sequence modeling capabilities, allowing it to model and predict the time series features of traffic data. Bi-LSTM is composed of two LSTM units in opposite directions, and each unidirectional LSTM unit mainly includes a forget gate, an input gate, and an output gate, which enables it to better process information with long time spans. The structure is shown in Fig. 4.

Fig. 4 Structure of unidirectional LSTM unit.

Assume the input data series is x1, x2, …, xt and the output signals are h1, h2, …, ht. The relevant formulas are as follows.

$$f_{t} = \sigma \left( {W_{f} \cdot \left[ {h_{t - 1} ,x_{t} } \right] + b_{f} } \right)$$
(13)
$$i_{t} = \sigma \left( {W_{i} \cdot \left[ {h_{t - 1} ,x_{t} } \right] + b_{i} } \right)$$
(14)
$$\tilde{c}_{t} = \tanh \left( {W_{c} \cdot \left[ {h_{t - 1} ,x_{t} } \right] + b_{c} } \right)$$
(15)
$$c_{t} = f_{t} c_{t - 1} + i_{t} \tilde{c}_{t}$$
(16)
$$o_{t} = \sigma (W_{o} \cdot [h_{t - 1} ,x_{t} ] + b_{o} )$$
(17)
$$h_{t} = o_{t} \tanh (c_{t} )$$
(18)

where ft is the forget gate, it is the input gate, \(\tilde{c}_{t}\) is the candidate cell state, ct is the updated cell state, ot is the output gate, and ht is the hidden-state output. Wf, bf, Wi, bi, Wc, bc, Wo and bo denote the neural network parameters. σ() is the sigmoid neural network layer, and tanh() denotes the tanh neural network layer in Eq. (15) and the tanh activation function in Eq. (18).
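Equations (13) to (18) can be traced step by step in a few lines of NumPy. The sketch below is illustrative only and is not the trained model of this paper: the layer sizes, the random initialization, and the dictionary layout of the parameters are assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Eqs. (13) to (18)."""
    z = np.concatenate([h_prev, x_t])                     # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])      # Eq. (13), forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])      # Eq. (14), input gate
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])  # Eq. (15), candidate state
    c_t = f_t * c_prev + i_t * c_tilde                    # Eq. (16), element-wise
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])      # Eq. (17), output gate
    h_t = o_t * np.tanh(c_t)                              # Eq. (18)
    return h_t, c_t


rng = np.random.default_rng(0)
n_in, n_hidden = 6, 4  # hypothetical sizes (six input features per cycle)
params = {}
for gate in "fico":
    params[f"W_{gate}"] = rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in))
    params[f"b_{gate}"] = np.zeros(n_hidden)

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):  # five time steps of random input
    h, c = lstm_step(x_t, h, c, params)
print(h.shape)
```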

Since the Bi-LSTM model is composed of two LSTM units in opposite directions, it can process time-series data along both forward and backward time steps, fully considering information from historical and future moments. By integrating information from preceding and succeeding moments, it can better capture the dynamic changes in traffic data. The model structure is shown in Fig. 5. The structures of the Bi-LSTM model are intuitively displayed through Fig. 4 and Fig. 5, which improves the interpretability of the prediction model.

Fig. 5

Bi-LSTM model structure.

The forward hidden vector \(\overrightarrow {h}_{t - 1}\) generates a new hidden vector \(\overrightarrow {h}_{t}\), and the backward hidden vector \(\overleftarrow {h}_{t - 1}\) generates a new hidden vector \(\overleftarrow {h}_{t}\). Because the backward unit reads the sequence in reverse order, its previous state is the one computed from the later time step. By combining the outputs of the forward and backward passes, we obtain Yt. The specific calculation is as follows.

$$\overrightarrow {h}_{t} = {\text{LSTM}}\left( {x_{t} ,\overrightarrow {h}_{t - 1} } \right)$$
(19)
$$\overleftarrow {h}_{t} = {\text{LSTM}}\left( {x_{t} ,\overleftarrow {h}_{t - 1} } \right)$$
(20)
$$Y_{t} = \tanh \left( {w_{{\overrightarrow {h} y}} \overrightarrow {h}_{t} + w_{{\overleftarrow {h} y}} \overleftarrow {h}_{t} + b_{y} } \right)$$
(21)

where \(\overrightarrow {h}_{t}\) and \(\overleftarrow {h}_{t}\) are the outputs of the forward and backward hidden layers at time step t, respectively; \(w_{{\overrightarrow {h} y}}\) and \(w_{{\overleftarrow {h} y}}\) are the weight matrices connecting the forward and backward hidden layers to the output layer; and by denotes the bias term.
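The combination in Eqs. (19) to (21) can be sketched as follows. To keep the example short, a scalar `toy_cell` stands in for the full LSTM unit, and the weights `w_fy`, `w_by` and bias `b_y` are arbitrary scalars; these names and values are assumptions for illustration, not part of the paper.

```python
import math


def run_direction(xs, step, h0):
    """Apply a recurrent step over xs in the given order; return one state per input."""
    states, h = [], h0
    for x in xs:
        h = step(x, h)
        states.append(h)
    return states


def bidirectional(xs, step_fwd, step_bwd, h0, w_fy, w_by, b_y):
    """Combine forward and backward hidden states as in Eqs. (19)-(21)."""
    h_fwd = run_direction(xs, step_fwd, h0)              # Eq. (19)
    # The backward unit reads the reversed sequence; reversing its states
    # again realigns them with the original time order.
    h_bwd = run_direction(xs[::-1], step_bwd, h0)[::-1]  # Eq. (20)
    return [
        math.tanh(w_fy * hf + w_by * hb + b_y)           # Eq. (21)
        for hf, hb in zip(h_fwd, h_bwd)
    ]


def toy_cell(x, h):
    """Stand-in recurrent cell (not an LSTM), used only to keep the sketch short."""
    return math.tanh(0.5 * x + 0.5 * h)


xs = [0.2, 0.4, 0.6, 0.8]
ys = bidirectional(xs, toy_cell, toy_cell, 0.0, w_fy=0.7, w_by=0.3, b_y=0.0)
print(ys)
```

Each output Yt therefore depends on the inputs before t through the forward state and on the inputs after t through the backward state.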

Data input and output of the prediction model

The commonly used model prediction method involves directly outputting the prediction results after inputting the data (i.e., single-stage prediction). However, as shown in Sections "Traffic spillover identification and dissipation criteria" and "Data collection approach and methods" of this paper, the queue length of stranded vehicles affects the traffic spillover situation on the road, and the queue length of stranded vehicles in a future cycle cannot be directly calculated or obtained. Therefore, we propose a two-stage traffic spillover dissipation prediction model based on Bi-LSTM. In the first stage, the Bi-LSTM model is used to predict the queue length of stranded vehicles on the road. After verifying that the first stage prediction results meet the required criteria, these results are used as one of the feature inputs for the second stage prediction, where they are combined with the other second-stage input data to jointly predict the traffic spillover dissipation state of the road segment. This approach enhances the interpretability of the model and improves its prediction accuracy.

The Bi-LSTM prediction model used in this paper consists of two stages of data input and output, where the input data sets for both stages, X1i and X2i, include feature data and destination data. As shown in Fig. 6, in the first stage, the feature data of X1i include the data Qi, Ci, and tfi set for different cycles, as well as the resulting traffic parameter data Q1i, Q2i, vi, ki, and Ri, where Ri is calculated using formulas (7) and (8). The destination data is the queue length of stranded vehicles Li for each cycle, calculated using formula (9), and the output data is the model’s predicted queue length of stranded vehicles Lj for a future cycle. In the second stage, the feature data of X2i are based on the data from the first stage, excluding Ri, and further include the predicted Lj from the first stage model and the occupancy state Sg1i, Sg2i, Sr1i, and Sr2i of detector 1 and detector 2 from Fig. 2. The destination data for this stage is the traffic spillover dissipation state Pi for each cycle (as derived from Table 1), and the output data is the model’s predicted traffic spillover dissipation state Pj for a future cycle.

Fig. 6

Data input and output of the two-stage prediction model.
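The two feature sets described above can be sketched as plain Python lists. The dictionary keys below simply mirror the symbols in the text (for example `"Sg1"` for Sg1i), and the numeric values of the sample cycle are invented; neither is taken from the authors' code or data.

```python
# Stage 1 features: set values Q, C, tf plus the resulting traffic parameters.
STAGE1_FEATURES = ["Q", "C", "tf", "Q1", "Q2", "v", "k", "R"]

# Stage 2 features: stage 1 features without R, plus the predicted queue
# length from stage 1 and the four detector occupancy states.
STAGE2_FEATURES = [name for name in STAGE1_FEATURES if name != "R"] + [
    "L_pred", "Sg1", "Sg2", "Sr1", "Sr2",
]


def stage1_sample(cycle):
    """Return (features, target) for stage 1; the target is the queue length L."""
    return [cycle[name] for name in STAGE1_FEATURES], cycle["L"]


def stage2_sample(cycle, l_pred):
    """Return (features, target) for stage 2; the target is the dissipation state P."""
    record = dict(cycle, L_pred=l_pred)
    return [record[name] for name in STAGE2_FEATURES], cycle["P"]


# Hypothetical record for one signal cycle (values are made up).
cycle = {
    "Q": 1200, "C": 100, "tf": -5, "Q1": 31, "Q2": 29, "v": 18.5, "k": 42.0,
    "R": 3, "L": 35.0, "Sg1": 1, "Sg2": 0, "Sr1": 0, "Sr2": 0, "P": 2,
}

x1, y1 = stage1_sample(cycle)
x2, y2 = stage2_sample(cycle, l_pred=33.0)
print(len(x1), len(x2))
```

Keeping the two feature lists explicit makes it easy to check that Ri is used only in the first stage and that the predicted Lj enters only the second.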

Model training

Based on the principles of the above Bi-LSTM model and the data input and output, the training process of the Bi-LSTM-based two-stage traffic spillover dissipation prediction model is shown in Fig. 7. The specific process is as follows.

  1. (1)

    Data setup: Traffic data for N cycles are collected using VISSIM 11 to form the two-stage input data sets X1i and X2i. Then, based on the content of Section "Data input and output of the prediction model", the feature data, destination data, and output data in the two-stage prediction model are set, and the data are split into training and testing sets at an 8:1 ratio.

  2. (2)

    K-fold cross-validation: To ensure the reliability of the model, the dataset is randomly divided into 10 subsets of similar size. Each time, one subset is selected as the test set, and the remaining 9 subsets are used as the training set. This allows for 10 rounds of training and testing, generating 10 model performance metrics. The average and standard deviation of these metrics are then calculated to evaluate the model’s stability and generalization ability.

  3. (3)

    Hyperparameter tuning: The Random Search method is employed to optimize the hyperparameters of the Bi-LSTM model, such as the number of epochs (training iterations over the entire dataset) and batch size (number of samples per gradient update). Specifically, a predefined number of hyperparameter combinations are randomly sampled from the specified search space. Each combination is used to train the model, and its performance is evaluated on a validation set. The combination yielding the best performance is selected.

  4. (4)

    Parameter adjustment: The initial parameters of the Bi-LSTM model are set, and parameters such as the number of epochs (passes over the entire training set) and the batch size are adjusted through experience and multiple trials. The adjustment is complete once R2 is greater than or equal to 0.9.

  5. (5)

    Determination of evaluation indices: This paper selects commonly used evaluation indices for predictive models: Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Coefficient of Determination (R2). However, since the real-world data contain zeros, which would lead to division by zero in MAPE calculations, MAE, RMSE, and R2 are used as the primary indices to assess the regression performance. Specifically, smaller MAE and RMSE values indicate lower prediction errors, while a larger R2 value signifies a better fit of the model. The specific formulas are as follows.

    $$MAE = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left| {y_{i} - p_{i} } \right|} \in [0, + \infty )$$
    (22)
    $$R^{2} = 1 - \frac{{\sum\limits_{i = 1}^{n} {(y_{i} - p_{i} )^{2} } }}{{\sum\limits_{i = 1}^{n} {(y_{i} - \overline{y} )^{2} } }} \in ( - \infty ,1]$$
    (23)
    $$RMSE = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {(y_{i} - p_{i} )^{2} } } \in [0, + \infty )$$
    (24)
    where n is the number of samples, yi is the true value of the i-th sample, pi is the predicted value of the i-th sample, and ȳ is the average of the true values of the samples.

    Fig. 7

    Bi-LSTM model two-stage training flow chart.

  6. (6)

    Data grouping: Since traffic flow has a certain level of continuity, traffic flow characteristics are generally stable over 10 to 15 min. Using a 100-s cycle for measurement, there are 6 cycles in 10 min. Therefore, predicting the destination data for the 6th cycle using the input data from the previous 5 cycles yields good results. Thus, every 6 cycles of input data X1i or X2i are grouped together, and each stage is divided into N-5 groups, where i takes values of j-5, j-4, j-3, j-2, j-1, with 6 ≤ j ≤ N, where j is the cycle number.

  7. (7)

    Model training: In the first stage, the grouped X1i is input into the Bi-LSTM model. The training is done using the data from the first to the fifth cycle of the first input group. After training, the output is compared with the actual data of the 6th cycle. If the model error, measured by R2 or RMSE, meets the requirements, the next group of input data is trained. This continues until all N-5 groups of input data have been trained. The model will then output the predicted road segment queue length Lj for a future cycle. In the second stage, after the model has completed training in the first stage, the output Lj from the first stage is incorporated as feature data. Similarly, the grouped X2i is input into the Bi-LSTM model. Training begins with the first input group and continues until all N-5 groups have been trained. The model will then output the predicted road traffic spillover dissipation state Pj for a future cycle.
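The grouping in step (6) amounts to a sliding window over the cycles. The following is a minimal sketch, assuming the cycles are stored in order in two parallel lists; the function name `make_groups` and the toy data are hypothetical.

```python
def make_groups(features, targets, window=5):
    """Pair the features of `window` consecutive cycles with the next cycle's target.

    features[i] and targets[i] belong to cycle i + 1, so N cycles yield
    N - window groups (N - 5 for the default window).
    """
    if len(features) != len(targets):
        raise ValueError("features and targets must cover the same cycles")
    groups = []
    for j in range(window, len(features)):  # j is the 0-based index of the target cycle
        groups.append((features[j - window:j], targets[j]))
    return groups


# Toy data for N = 10 cycles: one made-up feature vector and target per cycle.
N = 10
features = [[float(cycle)] for cycle in range(1, N + 1)]
targets = [10.0 * cycle for cycle in range(1, N + 1)]

groups = make_groups(features, targets)
print(len(groups))    # N - 5 groups
print(groups[0])      # cycles 1-5 as input, cycle 6 as target
```

With N = 10 this produces five groups, the first of which uses cycles 1 to 5 to predict cycle 6, matching the range 6 ≤ j ≤ N given in step (6).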

The model principles, data input and output, evaluation indices, and training procedure described above define the composition of the two-stage Bi-LSTM prediction model for traffic spillover dissipation and provide a theoretical foundation for the construction of the subsequent prediction models.
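For reference, the evaluation indices in Eqs. (22) to (24) can be computed as shown below. This is a sketch with invented sample values; the function names are chosen for the example. The true values deliberately include a zero, the case in which MAPE would be undefined.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error, Eq. (22)."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination, Eq. (23)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, Eq. (24)."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical queue lengths in metres; the first true value is zero.
y_true = [0.0, 10.0, 20.0, 30.0]
y_pred = [0.0, 12.0, 18.0, 33.0]

print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

Note that R2 as defined in Eq. (23) becomes negative when the squared prediction error exceeds the variance of the true values, so it is not bounded below by zero on unseen data.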

Results

Data collection

A short-distance intersection simulation network is set up in VISSIM 11, as shown in Fig. 8. The segment length L is 150 m, and Ls is 80 m, with the driving direction heading west. Two data collection points and two types of detectors are set up on this section to collect the data required for the prediction model.

Fig. 8

Short-distance intersection simulation network.

The data corresponding to each scenario are collected according to the data collection method described in Section "Data collection approach and methods". Taking one data collection scenario as an example, Scenario 6 is selected for Q, with a corresponding tf of -5 s and C of 100 s. Partial data obtained from this scenario are shown in Table 3.

Sensitivity analysis

Results of sensitivity analysis for Q

Using Q as the factor, the simulation data obtained above is used to calculate the probabilities α corresponding to each value of Pi (1, 2, 3, 4) over every 6 cycles, as well as the average values of road section queue length L, speed v, and density k. The sensitivity analysis of Q to P is shown in Fig. 9. The sum of the probabilities of Pi taking values 2, 3, and 4 is defined as the probability β of traffic spillover occurring on the road, while the sum of the probabilities of Pi taking values 3 and 4 is defined as the probability γ of non-dissipating traffic spillover on the road. The correlation analysis of Q, L, v, k with β and γ is shown in Fig. 10.
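The definitions of α, β and γ can be made concrete with a short sketch. The list of states below is a made-up group of six cycles, and the function name is hypothetical.

```python
from collections import Counter


def spillover_probabilities(states):
    """Return (alpha, beta, gamma) for a sequence of dissipation states P in {1, 2, 3, 4}."""
    counts = Counter(states)
    n = len(states)
    alpha = {p: counts.get(p, 0) / n for p in (1, 2, 3, 4)}  # probability of each state
    beta = alpha[2] + alpha[3] + alpha[4]   # spillover occurs
    gamma = alpha[3] + alpha[4]             # spillover does not dissipate
    return alpha, beta, gamma


states = [1, 1, 2, 3, 4, 2]  # hypothetical states over six cycles
alpha, beta, gamma = spillover_probabilities(states)
print(alpha, beta, gamma)
```

By construction γ can never exceed β, since every non-dissipating spillover is also a spillover.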

Fig. 9

Sensitivity analysis of Q with respect to P.

Fig. 10

Correlation analysis of Q, L, v, k with P.

As shown in Fig. 9, β is generally positively correlated with Q and varies together with it. However, β does not respond immediately; it changes noticeably only after Q has increased or decreased by a certain amount. Meanwhile, L, v, and k also vary with Q: the larger L and k are and the smaller v is, the larger β becomes.

As shown in Fig. 10, the correlation coefficients R between Q and β, γ are approximately 0.95358 and 0.968, respectively; those between L and β, γ are approximately 0.97996 and 0.92704; those between v and β, γ are approximately -0.99126 and -0.93779; and those between k and β, γ are approximately 0.98969 and 0.94062. When the number of observed data points is n = 12, the correlation coefficient test table35 gives a critical value of Rn-2 = 0.576. Since the correlation coefficients between Q, L, k and β, γ are all positive and above 0.927, which is greater than Rn-2, there is a significant positive correlation between Q, L, k and β, γ. Likewise, since the correlation coefficients between v and β, γ are negative with absolute values greater than 0.93, which is also greater than Rn-2, there is a significant negative correlation between v and β, γ. This indicates that, as Q varies, Q, L, v, and k all have a significant impact on the traffic spillover dissipation state P of the road.
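The significance check used here compares the absolute value of the Pearson correlation coefficient with a tabulated critical value. A minimal sketch follows; the critical values 0.576 (n = 12) and 0.666 (n = 9) are the ones quoted in this section, while the twelve data points are invented for illustration and are not the simulation results.

```python
import math

# Critical values quoted in the text, keyed by degrees of freedom n - 2.
R_CRITICAL = {10: 0.576, 7: 0.666}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_significant(r, n):
    """True if |r| exceeds the tabulated critical value for n observations."""
    return abs(r) > R_CRITICAL[n - 2]


# Hypothetical flows Q and spillover probabilities beta for n = 12 observations.
q = [800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900]
beta = [0.0, 0.0, 0.05, 0.1, 0.2, 0.35, 0.5, 0.6, 0.75, 0.85, 0.95, 1.0]

r = pearson_r(q, beta)
print(round(r, 3), is_significant(r, len(q)))
```

The same function applies to the negative correlations with v, because the test uses the absolute value of R.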

Results of sensitivity analysis for C

Using C as the factor, the simulation data obtained above is used to calculate the probabilities α for each value of Pi over 30 cycles, as well as the average values of L, v, and k. The sensitivity analysis of C to P is shown in Fig. 11. The correlation analysis of C, L, v, k with β and γ is shown in Fig. 12.

Fig. 11

Sensitivity analysis of C with respect to P.

Fig. 12

Correlation analysis of C, L, v, k with P.

As shown in Fig. 11, with the increase of C, β undergoes small fluctuating changes and reaches its minimum at a certain ideal cycle. At the same time, as L increases, v decreases, and k increases, β also increases.

As shown in Fig. 12, the correlation coefficients R between C and β, γ are approximately -0.67937 and -0.85272, respectively; those between L and β, γ are approximately 0.85326 and 0.72292; those between v and β, γ are approximately -0.89829 and -0.87268; and those between k and β, γ are approximately 0.90254 and 0.9581. When the number of observed data points is n = 9, Rn-2 = 0.666. Since the correlation coefficients between L and β, γ are positive and above 0.722, and those between k and β, γ are positive and above 0.902, all of which are greater than Rn-2, there is a positive correlation between L and β, γ, and a significant positive correlation between k and β, γ. Since the correlation coefficients between C, v and β, γ are negative with absolute values above 0.679, which is also greater than Rn-2, there is a negative correlation between C, v and β, γ. This indicates that, as C varies, C, L, and v all have some impact on P, and k has a larger impact on P.

Results of sensitivity analysis for tf

Using tf as the factor, the simulation data obtained above is used to calculate the probabilities α for each value of Pi over 36 cycles, as well as the average values of L, v, and k. The sensitivity analysis of tf to P is shown in Fig. 13. The correlation analysis of tf, L, v, k with β and γ is shown in Fig. 14.

Fig. 13

Sensitivity analysis of tf with respect to P.

Fig. 14

Correlation analysis of tf, L, v, k with P.

As shown in Fig. 13, with the variation of tf, β undergoes significant fluctuations and reaches its minimum at a certain ideal offset. At the same time, as L increases, v decreases, and k increases, β also increases.

As shown in Fig. 14, the correlation coefficients R between tf and β, γ are approximately 0.69052 and 0.73042, respectively; those between L and β, γ are approximately 0.94465 and 0.93939; those between v and β, γ are approximately -0.67771 and -0.71087; and those between k and β, γ are approximately 0.68146 and 0.71386. When the number of observed data points is n = 9, Rn-2 = 0.666. Since the correlation coefficients between tf and β, γ, and between k and β, γ, are all positive and above 0.68, and those between L and β, γ are positive and above 0.939, all of which are greater than Rn-2, there is a positive correlation between tf, k and β, γ, and a significant positive correlation between L and β, γ. Since the correlation coefficients between v and β, γ are negative with absolute values above 0.677, which is also greater than Rn-2, there is a negative correlation between v and β, γ. This indicates that, as tf varies, tf, v, and k all have some impact on P, and L has a larger impact on P.

In summary, changes in Q, C and tf affect P of the road segment, exhibiting a certain causal or correlational relationship. According to the principle that a larger absolute value of the correlation coefficient indicates a closer relationship between features and the target variable, Q has the highest feature importance. Additionally, the road traffic parameters L, v, and k generated during this process also have varying degrees of impact on P. This further confirms that using the above data as the input for the traffic spillover dissipation prediction model for short-distance intersections proposed in this paper is reasonable.

Analysis of prediction results

Analysis of the prediction model results for queue length of stranded vehicles

In this paper, a traffic spillover dissipation prediction model based on Bi-LSTM is constructed in Python according to the content in Sect. "Two-stage prediction model of traffic spillover dissipation based on Bi-LSTM". The first stage of the model prediction is performed using Bi-LSTM, which predicts the queue length of stranded vehicles Lj in a future cycle. The prediction results are compared with those obtained using other machine learning algorithms, namely DT, CNN, RF, LSTM and GRU. Taking tf = -5 s and C = 100 s as an example, Q is selected from Schemes 1 to 10, and a total of 720 data points are collected. The prediction results and evaluation indices of the different models are compared and analyzed using the 80 corresponding test set data points, as shown in Fig. 15 and Table 4.

Fig. 15

Comparison of prediction results from different models.

Table 4 Comparison of evaluation index for different models.

Figure 15 compares the predicted and true values of the queue length of stranded vehicles output by each model. Because the prediction results of the models are densely packed, part of the line chart is enlarged for easier observation. As seen in Fig. 15, compared to the DT, CNN, and RF models, the GRU and LSTM models are more sensitive to time series data, and their predicted results are closer to the true values. Additionally, as shown in Table 4, the R2, RMSE, and MAE of the DT model are 0.941, 7.369 and 6.188, respectively; those of the CNN model are 0.949, 6.825 and 5.663, respectively; those of the RF model are 0.956, 6.348 and 5.200, respectively; those of the GRU model are 0.963, 5.942 and 4.800, respectively; and those of the LSTM model are 0.963, 5.831 and 4.725, respectively. Since a larger R2 indicates a higher model fitting degree, and smaller RMSE and MAE indicate smaller model errors, the fitting degree and error performance of the GRU and LSTM models are superior to those of the DT, CNN and RF models. Compared to the LSTM prediction model, the Bi-LSTM model can capture both forward and backward contextual information in the time series data, which gives it an advantage over the LSTM model, which can only capture information in the forward direction.

Therefore, it can be seen from Fig. 15 that the prediction results of the Bi-LSTM model are closest to the true values. Meanwhile, as shown in Table 4, the R2, RMSE, and MAE of the Bi-LSTM model are 0.968, 5.403, and 4.300, respectively, indicating that the Bi-LSTM model has the highest fitting degree and the smallest prediction error. The above comparison results demonstrate that based on the LSTM model, the further adoption of the Bi-LSTM model to predict the queuing length of stranded vehicles on road segments is effective, and its results are optimal.

To further analyze the prediction accuracy of the above Bi-LSTM model, the predicted values of Lj output by the Bi-LSTM model were randomly ordered and compared with the true values, and an error distribution plot and a unitary linear analysis plot were drawn, as shown in Fig. 16 and Fig. 17.

Fig. 16

Predictive value and error distribution.

Fig. 17

Unitary linear analysis.

In Fig. 16, the blue scatter points represent the predicted values of Lj from the model’s test set, while the green bands indicate the error distribution between the predicted and actual values of Lj. A narrower green band signifies a smaller prediction error; for instance, the error for data points 0–11 is zero. As Lj increases, errors emerge, yet the difference between the predicted and actual values of Lj does not exceed 9 m, with a maximum error rate of 14.29%. According to statistics, the average error rate between the predicted and actual values of Lj is 6.6%. As illustrated in Fig. 17, the equation y = 0.992x + 0.517 is the fitted straight line between the predicted and actual values of Lj, with a slope of 0.992 and an intercept of 0.517. The closer the slope is to 1 and the intercept is to 0, the closer the fitted line is to y = x, indicating that the predicted values are closer to the true values. Therefore, it can be concluded that the two-stage traffic spillover dissipation prediction model based on Bi-LSTM proposed in this paper has high accuracy in predicting the queue length of stranded vehicles in the first stage.
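A fitted line of this kind can be obtained with ordinary least squares. The sketch below assumes the true values are placed on the x axis and the predicted values on the y axis, which the text does not state explicitly. The data are synthetic points constructed to lie exactly on the reported line, so the example only demonstrates the procedure and does not reproduce the test set.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical true queue lengths in metres (assumed x axis).
true_l = [0.0, 7.0, 14.0, 28.0, 42.0, 63.0]
# Synthetic predictions placed on the line reported for Fig. 17.
pred_l = [0.992 * x + 0.517 for x in true_l]

slope, intercept = fit_line(true_l, pred_l)
print(round(slope, 3), round(intercept, 3))
```

A slope near 1 with an intercept near 0 means the fitted line almost coincides with y = x, which is the criterion applied to Fig. 17.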

Analysis of the traffic spillover dissipation prediction model results

Once the first stage of the model prediction is complete, the predicted Lj is used as feature data for the second stage, which predicts the traffic spillover dissipation state Pj for a future cycle. To better validate the effectiveness of the two-stage traffic spillover dissipation prediction model based on Bi-LSTM proposed in this paper, a single-stage traffic spillover dissipation prediction model was also set up. In this model, after inputting feature and destination data, the model directly predicts Pj without using the Lj predicted in the first stage as feature data. The prediction results and evaluation indices of this model are compared with those of the two-stage prediction model, as shown in Fig. 18 and Table 5, where the test data consist of 800 data points randomly selected from the test set corresponding to all collected data.

Fig. 18

Comparison of prediction results between single-stage and two-stage prediction models.

Table 5 Comparison of evaluation index between single-stage and two-stage prediction models.

Figure 18 records the prediction scenarios of the single-stage model and the two-stage model, where x–y indicates that when the true value of Pj is x, the predicted value output by the model is y. Since Pj has four possible prediction outcomes: Pj = 1 (no traffic spillover), Pj = 2 (effectively dissipating traffic spillover), Pj = 3 (potentially non-dissipating traffic spillover), and Pj = 4 (absolutely non-dissipating traffic spillover), both x and y can take values of 1, 2, 3, or 4, resulting in 16 possible prediction scenarios: 1–1, 1–2, 1–3, 1–4, 2–1, 2–2, 2–3, 2–4, 3–1, 3–2, 3–3, 3–4, 4–1, 4–2, 4–3, and 4–4. However, only 9 scenarios are included in the randomly selected 800 pieces of data. For the single-stage prediction model, the scenarios of 1–1, 2–2, 3–3, 4–4, 1–2, 2–1, 3–2, 3–4, and 4–3 occurred 279 times, 117 times, 128 times, 162 times, 26 times, 35 times, 13 times, 10 times, and 30 times, respectively. For the two-stage prediction model, these scenarios occurred 294 times, 132 times, 141 times, 176 times, 11 times, 20 times, 8 times, 2 times, and 16 times, respectively. Statistically, the accuracy of traffic spillover identification for the single-stage and two-stage prediction models (the accuracy when Pj takes the values of 1, 2, 3, and 4) is 85.75% and 92.88%, respectively. The accuracy of traffic spillover dissipation prediction (the accuracy when Pj takes the values of 2, 3, and 4) is 82.88% and 90.71%, respectively. As shown in Table 5, the R2, RMSE and MAE for the single-stage prediction model are 0.901, 0.377 and 0.138, respectively, while the R2, RMSE and MAE for the two-stage prediction model are 0.951, 0.267 and 0.075, respectively. The above comparison results show that the two-stage prediction model is superior and is an effective improvement over the single-stage prediction model.
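The two accuracy figures for the two-stage model can be recomputed from the scenario counts listed above. The sketch assumes that the dissipation accuracy is taken over the samples whose true state is 2, 3 or 4; this reading is an assumption, but it reproduces the reported two-stage values.

```python
# Scenario counts for the two-stage model, keyed by (true state, predicted state).
two_stage = {
    (1, 1): 294, (2, 2): 132, (3, 3): 141, (4, 4): 176,
    (1, 2): 11, (2, 1): 20, (3, 2): 8, (3, 4): 2, (4, 3): 16,
}


def accuracy(counts, true_states=(1, 2, 3, 4)):
    """Share of correct predictions among samples whose true state is in true_states."""
    total = sum(n for (true, _), n in counts.items() if true in true_states)
    correct = sum(
        n for (true, pred), n in counts.items()
        if true in true_states and true == pred
    )
    return correct / total


identification = accuracy(two_stage)                      # all four states
dissipation = accuracy(two_stage, true_states=(2, 3, 4))  # spillover states only

print(f"{identification:.2%}", f"{dissipation:.2%}")
```

Here 743 of the 800 samples are classified correctly overall, and 449 of the 495 samples with a true state of 2, 3 or 4 are classified correctly.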

To further analyze the model’s prediction performance in different traffic flow intervals, we continue using tf = -5 s and C = 100 s as examples. The Q values are selected from Schemes 1 to 10, and a total of 720 data points are collected. From these, 80 data points are randomly selected from the following 8 traffic flow intervals: [725,900], (900,1000], (1000,1100], (1100,1200], (1200,1300], (1300,1400], (1400,1500], and (1500,1575], with 10 data points selected from each interval for testing. The prediction results for each flow interval are shown in Fig. 19.

Fig. 19

Prediction results of single-stage and two-stage models in different traffic flow intervals.

Figure 19 compares the true values of Pj in different traffic flow intervals with the predicted values from the single-stage model and the two-stage model. Red circles mark the points where the prediction deviates from the true value. Compared to the single-stage prediction model, the two-stage prediction model provides more accurate results with less deviation. Except for the traffic flow intervals of (1100,1200], (1200,1300], and (1300,1400], where a few predicted values do not match the true values, the predicted values for all other flow intervals are identical to the true values.

To further analyze the relationship between the predicted values of Pj in the second stage of the two-stage prediction model and the true values, the true and predicted values corresponding to different traffic flow intervals in Fig. 19 are recorded. The number of occurrences of each scenario is then counted, as shown in Fig. 20. In Fig. 20, the colors for 1–1, 2–2, 3–3, and 4–4 are all in the green range, indicating accurate predictions; the colors for 2–1 and 4–3 are in the purple range, indicating deviations in the predictions. Scenarios with zero occurrences are not displayed in the figure.

Fig. 20

Analysis of prediction scenarios in different traffic flow intervals.

As shown in Fig. 20, the entire traffic flow interval can be divided into three stages. Stage 1 and Stage 3 both have accurate predictions, while Stage 2 has 3 instances of inaccurate predictions. A detailed analysis is as follows.

Stage 1: When the traffic flow is in the [725,900] and (900,1000] intervals, the flow is relatively low, with no traffic spillover occurring, and the traffic flow is quite stable. The prediction model can accurately predict the state of no traffic spillover on the road, with no prediction errors.

Stage 2: When the traffic flow gradually increases, the number of traffic spillover occurrences on road segments in the flow intervals of (1100, 1200], (1200, 1300], and (1300, 1400] begins to increase continuously, leading to some critical situations. For example, when the upstream intersection has a green light, if detector 1 is occupied and the tail of the queuing vehicles on the road segment is near detector 2, Pi may be detected as 1 or 2; when the upstream intersection has a red light, if detector 1 is occupied and the tail of the queuing vehicles is near detector 2, Pi may be detected as 3 or 4 (as shown in Fig. 21), with the corresponding prediction scenario being 4–3. In this stage, the prediction results for traffic spillover dissipation are not completely accurate, but such deviations occur infrequently and have a negligible impact on the overall prediction results.

Fig. 21

Simulation scenario corresponding to the prediction case of 4–3.

Stage 3: When the flow increases to the (1400, 1500] and (1500, 1575] intervals, the number of instances of traffic spillover that cannot dissipate in time increases significantly, while the number of instances with no traffic spillover decreases sharply. In this stage, the traffic spillover state recognition reaches a stable state, and the prediction model can accurately predict whether traffic spillover will occur in a given future period, as well as whether the spillover will dissipate in time, without any prediction deviation.

The model prediction results are verified through simulation. Taking tf = -10 s, C = 100 s, and Q using Scheme 1 as an example, the traffic spillover dissipation state for the 33rd period is predicted, with data from periods 28 to 32 used as input. The model predicts that the traffic spillover in the 33rd period will dissipate effectively (i.e., Pj = 2), and in the actual simulation, the traffic spillover in the 33rd period also dissipates effectively, as shown in Figs. 22 and 23. In Fig. 22, traffic spillover occurs during the green light time for westbound traffic at the upstream intersection, and in Fig. 23, when the red light for westbound traffic begins at the upstream intersection, the queuing vehicles are within the safe distance, indicating that the traffic spillover can dissipate in time. This validation in VISSIM 11 further confirms the effectiveness of the two-stage traffic spillover dissipation prediction model for short-distance intersections proposed in this paper.

Fig. 22

Traffic spillover occurs during the green light time for westbound traffic at the upstream intersection.

Fig. 23

Traffic spillover dissipates when the red light starts for westbound traffic at the upstream intersection.

Discussion

In terms of data collection, the traffic spillover identification and dissipation judgment conditions for short-distance intersections are proposed on the basis of traffic wave theory. Based on this, VISSIM 11 is used to construct a simulation road network of two short-distance intersections. By setting different inputs Q, C, and tf corresponding to different data collection schemes, data such as Q1i, Q2i, Ri, Li, vi, ki, Sg1i, Sg2i, Sr1i, Sr2i, and Pi are obtained for each cycle. Machine learning models are then used to predict traffic spillover dissipation on the road segment between short-distance intersections, which helps improve the interpretability of the prediction model.

In terms of the feasibility verification of the Bi-LSTM model selection, we construct six machine learning models (DT, CNN, RF, GRU, LSTM and Bi-LSTM) using Python to predict the queue length of stranded vehicles (Lj) for a future cycle in the first stage, followed by a comparative analysis. The results show that the GRU and LSTM models, which are more sensitive to time series data, outperform the DT, CNN, and RF models. Compared with the DT, CNN and RF models, the MAE of the LSTM model decreased by 23.64%, 16.56% and 9.13%, respectively; the RMSE decreased by 20.87%, 14.56% and 8.14%, respectively; and the R2 increased by 2.34%, 1.48% and 0.73%, respectively. For the GRU model, the MAE decreased by 22.43%, 15.24%, and 7.69%, respectively; the RMSE decreased by 19.36%, 12.94% and 6.40%, respectively; and the R2 increased by 2.23%, 1.48%, and 1.26%, respectively. Although the MAE, RMSE and R2 values of the GRU and LSTM models are quite close, the LSTM model slightly outperforms the GRU model. The Bi-LSTM model, which can capture contextual information of time-series data in both forward and backward directions, outperforms the GRU and LSTM models, which can only capture information in the forward direction. Compared with the GRU and LSTM models, its MAE decreases by 10.42% and 8.99%, respectively; the RMSE decreases by 9.07% and 7.34%, respectively; and the R2 increases by 0.62% and 0.52%, respectively. This indicates that the Bi-LSTM model has higher prediction accuracy, and it is feasible to use the Bi-LSTM model in this study to predict traffic spillover dissipation at short-distance intersections.
Reference18 estimates the queue length under spillover conditions based on low-resolution detector data with an average error rate of 14.9% (prediction accuracy of 85.1%), and Reference15 first predicts traffic flow and then uses the predicted flow in a queue length calculation model to obtain real-time predictions with an average error rate of 14.2% (prediction accuracy of 85.8%). In contrast, this study uses traffic wave theory and the initial stranded vehicle queue length formula to identify traffic parameters related to road segment queue length, and collects these parameters as inputs to the prediction model, which improves the interpretability of the prediction model. As a result, the average error rate of the stranded vehicle queue length predicted by the Bi-LSTM model in the first stage is reduced to 6.6% (prediction accuracy of 93.4%), and the model can predict queue length under both non-spillover and spillover conditions. These results indicate that the Bi-LSTM model used in this paper meets the accuracy requirements for predicting the queue length of stranded vehicles in the first stage and provides a solid foundation for the subsequent prediction of road traffic spillover dissipation.

Building on this, a two-stage prediction method for traffic spillover dissipation based on Bi-LSTM is proposed, in which the Lj predicted in the first stage is used as one of the input features of the second stage to further predict the traffic spillover dissipation state Pj for a future cycle at short-distance intersections. To verify the effectiveness of the two-stage prediction method, both a single-stage prediction model and a two-stage prediction model are constructed to predict Pj. The results show that, compared with the single-stage prediction model, the MAE of the two-stage prediction model is reduced by 45.45%, the RMSE is reduced by 29.18%, and the R2 is increased by 5.55%. All evaluation indices of the two-stage prediction model are therefore superior to those of the single-stage prediction model, verifying that the two-stage prediction model proposed in this paper further improves the prediction accuracy of Pj. The method proposed in Reference7 identifies traffic spillover from upstream fixed detector occupancy data and was verified by simulation to have a spillover identification accuracy of 90.45%. In contrast, this study sets the two-stage data inputs and outputs based on traffic wave theory, detector data and sensitivity analysis, which enhances the interpretability of the model. The proposed two-stage traffic spillover dissipation prediction model based on Bi-LSTM can both identify traffic spillover and predict its dissipation state. The model achieves an accuracy of 92.88% in traffic spillover identification and 90.72% in traffic spillover dissipation state prediction, demonstrating good prediction accuracy. 
The model thus achieves both traffic spillover identification and dissipation prediction, which can support more targeted optimization and control strategies for traffic spillover problems at short-distance intersections and help improve the traffic capacity of short-distance intersections.
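
The data flow of the two-stage method can be sketched as follows. This is only an illustration of how the first-stage output is appended to the second-stage input: the two "models" below are hypothetical stand-in functions, not Bi-LSTM networks, and the feature names, coefficients and the 150 m segment length are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[Sequence[float]], float]


def build_stage2_input(features: Sequence[float], predicted_queue: float) -> List[float]:
    """Append the stage-1 queue-length prediction (Lj) to the stage-2 feature vector."""
    return list(features) + [predicted_queue]


def two_stage_predict(stage1: Model, stage2: Model,
                      features: Sequence[float]) -> Tuple[float, float]:
    """Run stage 1, feed its output into stage 2, and return (Lj, Pj)."""
    predicted_queue = stage1(features)
    stage2_input = build_stage2_input(features, predicted_queue)
    return predicted_queue, stage2(stage2_input)


def toy_stage1(features: Sequence[float]) -> float:
    """Stand-in for the stage-1 model (hypothetical rule, not a Bi-LSTM).

    features = [arrival_flow (veh/h), green_ratio]; returns a queue length in metres.
    """
    arrival_flow, green_ratio = features
    return max(0.0, 0.2 * arrival_flow * (1.0 - green_ratio))


def toy_stage2(inputs: Sequence[float]) -> float:
    """Stand-in for the stage-2 model (hypothetical rule, not a Bi-LSTM).

    Uses only the appended stage-1 prediction and returns a toy state score.
    """
    predicted_queue = inputs[-1]
    segment_length = 150.0  # assumed segment length in metres
    return predicted_queue / segment_length


if __name__ == "__main__":
    lj, pj = two_stage_predict(toy_stage1, toy_stage2, [900.0, 0.4])
    print(f"Stage 1 (Lj): {lj:.1f} m, stage 2 (Pj score): {pj:.2f}")
```

A single-stage baseline would call the second model on the raw features alone, which is the comparison the MAE, RMSE and R2 figures above refer to.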

Conclusions

This paper addresses an aspect of traffic spillover that is often overlooked in current studies, namely that spillover can dissipate on its own, and studies traffic spillover dissipation prediction at short-distance intersections. First, the conditions for identifying traffic spillover and determining its dissipation at short-distance intersections are proposed on the basis of traffic wave theory. Based on these conditions, VISSIM 11 is used to construct traffic operation and data collection scenarios for short-distance intersections. Next, the input data required by the prediction model are collected and derived using methods for calculating the vehicle queue length and the traffic spillover dissipation state of the road segments, which improves the interpretability of the model. Finally, a two-stage traffic spillover dissipation prediction model is constructed using the Bi-LSTM model. The vehicle queue length predicted in the first stage is used as feature data for the second stage, effectively improving the model's prediction accuracy. The results show that the model constructed using Bi-LSTM outperforms the DT, CNN, RF, GRU and LSTM models in predicting the vehicle queue length in the first stage, with a prediction accuracy of 93.4%, validating the feasibility of selecting the Bi-LSTM model. In the second stage, the model's predictions of the road segment's traffic spillover dissipation state outperform those of the single-stage prediction model. The model achieves an accuracy of 92.88% in traffic spillover identification and 90.72% in traffic spillover dissipation state prediction, confirming that the proposed two-stage Bi-LSTM model for traffic spillover dissipation is effective and can further improve prediction accuracy. The proposed prediction method can identify traffic spillover at short-distance intersections in advance and, based on this, further predict whether the spillover can dissipate in time. 
This approach is helpful for selecting targeted signal control strategies to solve the traffic spillover issue at short-distance intersections, thereby effectively improving the traffic capacity of such intersections.

Owing to practical constraints, this study is based on data collected in a traffic simulation environment. The performance of the prediction model on real-world traffic data, which contain more noise and variability, remains to be tested and validated. Follow-up research will therefore explore how to adapt to noise in real-world traffic data, how to integrate other data sources (such as weather, socioeconomic data or public transportation schedules) into the model to improve prediction performance, how to integrate the model into existing traffic management systems, how to enhance the model's generalization and scalability, and how to apply it to more traffic scenarios, with the ultimate aim of extending it to practical applications. Further considerations will include the impact of local adjustments on the overall network, the influence of adversarial manipulation, and applications in different cities or countries with varying traffic behaviors and infrastructure designs. Future work will also consider emerging models such as Transformer37 to enhance the ability to capture global information, address more complex traffic scenarios and task requirements, and continuously improve model performance.

In the future, we will also actively collect real-world traffic data including the behavioral characteristics of emergency vehicles, pedestrians, and vulnerable road users, as well as sudden accident data, to supplement the model. Additionally, we will introduce simulation data of more complex scenarios for model training to enhance the model’s adaptability and robustness, while formulating specialized emergency response mechanisms.

In addition, we will further conduct in-depth analyses of scenarios where the model performs poorly. By optimizing the model structure, increasing training with more real-world scenario data, and other means, we will improve the model’s prediction accuracy in complex situations, reduce the risks associated with model failure, and ensure the reliability and safety of the model in traffic management.