Abstract
Traditional trajectory compression algorithms, such as the sliding window (SW) algorithm and the Douglas–Peucker (DP) algorithm, typically use static thresholds based on fixed parameters such as ship dimensions or predetermined distances, which limits their adaptability. In this paper, the adaptive core threshold difference-DP (ACTD-DP) algorithm is proposed on the basis of the traditional DP algorithm. First, the original trajectory data are preprocessed according to the course values in automatic identification system (AIS) data, and redundant points are discarded. Then the number of compressed trajectory points corresponding to different thresholds is counted, and the functional relationship between the two is established by curve fitting. The characteristics of the fitted curve are analyzed to obtain the core threshold and the core threshold difference. Finally, a compression factor is introduced to determine the optimal core threshold difference, which is the key parameter controlling the accuracy and efficiency of the algorithm. Five algorithms are used to compress all ship trajectories in the experimental water area. For the ACTD-DP algorithm, the average compression ratio (ACR) is 87.53%, the average length loss ratio (ALLR) is 23.20%, the AMSED (mean synchronous Euclidean distance of all trajectories) is 68.9747 m, and the TIME is 25.6869 s. Compared with the other four algorithms, the ACTD-DP algorithm achieves a high compression ratio while maintaining the integrity of the trajectory shape. In addition, the compression results for four different trajectories show that the ACTD-DP algorithm has good robustness and applicability. Overall, the ACTD-DP algorithm gives the best compression effect among the five algorithms.
Introduction
With the deepening of global economic integration, the maritime industry’s role within the global economic system has become increasingly pivotal. The density of ship traffic in coastal ports, estuarial jetties, and other water areas continues to rise. The complex navigational environment presents novel challenges to maritime traffic supervision authorities and staff1,2. As an important carrier of ship motion information, AIS facilitates ship supervision and management. In accordance with the requirements of the SOLAS (International Convention for the Safety of Life at Sea), an increasing number of ships are mandated to be equipped with AIS devices to mitigate the risk of maritime collisions3.
Massive AIS data provide significant support for research across various domains, including studies on ship behavior patterns4,5, maritime route planning6,7, and navigation safety8,9,10. However, raw AIS data contain a large amount of redundant information, which can adversely affect the processing time of ship trajectories11. Therefore, studies based on ship trajectory features, such as ship navigation prediction12 and ship abnormal behavior detection13,14, all need to compress AIS data first.
Compression algorithms commonly used for AIS data can be categorized into offline and online approaches. Offline compression algorithms include the Douglas–Peucker (DP) algorithm15 and the TD-TR (top–down time ratio) algorithm16, while online compression algorithms include the sliding window (SW) algorithm17 and the opening window time ratio (OPW-TR) algorithm18. When algorithms such as DP and SW are used to compress AIS data, the selection of the threshold value is the most critical step. The literature19 combined the ship's course deviation, position deviation and the spatiotemporal characteristics of AIS data to set the threshold value of the SW algorithm. This approach facilitates the extraction of key feature points from the ship trajectory and thereby compresses the trajectory data. However, the choice of distance and angle thresholds needs to be analyzed according to the experimental waters and the type of vessel; in that paper the distance threshold was set to [0.731, 1.274] times the ship width, and the angle threshold was set to [\(\hbox {3.8}^{\circ }\),\(\hbox {5.0}^{\circ }\)]. The literature20 enhanced the traditional SW algorithm by taking significant variations in speed into account. The threshold was dynamically adjusted to reduce the scale deviation caused by speed differences, so the shape characteristics of the ship trajectory were preserved more effectively. However, this method is better suited to trajectories with obvious velocity changes, and the improvement may be smaller than expected for trajectories with little velocity change or simple motion modes. The literature21 initialized the threshold of the DP algorithm as a proportion of the ship's length and adjusted it dynamically based on the characteristics of the ship trajectory. The literature statistically analyzed the relative azimuth differences between trajectory points to identify and order the feature points, thereby yielding the final compressed trajectory.
The literature uses a parameter self-selection method based on Silhouette Coefficient scores, but determining the optimal parameter combination requires extensive experiments and adjustments. Based on the traditional DP algorithm, the literature22 used the minimum ship domain evaluation method to optimize the threshold setting, and combined the ship parameters and maneuverability to set the threshold to 0.8 times the ship length. However, the threshold value is selected empirically, and the generalization ability of the method is insufficient. Based on the traditional DP and SW algorithms, the literature23 combined the spatial and motion characteristics of the trajectory and applied statistical theory to determine the compression threshold. That paper recommended a distance threshold of 0.8 times the ship length for the DP algorithm and a threshold coefficient of 1.6 for the SW algorithm. The threshold value in that work is fixed and provides no adaptive adjustment mechanism. The author24 compressed ship trajectories using the DP algorithm and introduced a novel metric, the ACS (Average Compression Score), as an evaluation criterion. However, the choice of the optimal threshold is still based on experience, especially the determination of the optimal ACS value. With the distance threshold of the DP algorithm set to 0.87 nautical miles, the ACS reached its optimal value of 0.1472 and the compression rate reached 92.59%. The literature25, aiming to facilitate the rapid display of AIS trajectories on ECDIS (Electronic Chart Display and Information System), used the DP algorithm to compress ship trajectories on nautical charts of varying scales. However, the threshold value was obtained from tests in the research water area, so its generalization ability is weak.
The results show that the recommended threshold range was [10,20] meters and the compression ratio was [94%,97%] for charts with scales from 1:100,000 to 1:2,990,000, while the recommended threshold was 20 meters and the compression ratio was 97% for charts with a scale of 1:3,000,000. The literature26 introduced the ADPS (Adaptive Douglas–Peucker with Speed) algorithm, which combined trajectory characteristics, the rate of change of ship speed, and the distances between feature points to automatically calculate and set thresholds suitable for each trajectory. This approach retained key features and pertinent information while the trajectory was compressed. However, the performance of the algorithm still depends on the initial setting of some parameters, such as the method of determining the baseline. The literature27 used the DP algorithm with distinct distance thresholds to compress the up-bound and down-bound trajectories separately. The DTW (Dynamic Time Warping) algorithm was used to evaluate the compression effect and to solve for the optimal compression threshold. However, the threshold determined by this method depends on the set of candidate threshold values, so it can easily fall into a local optimum. The literature28 proposed the MPDP (Multi-Objective Peak Douglas–Peucker) algorithm, which adopted a peak sampling strategy and considered three optimization objectives of the trajectory: spatial characteristics, course, and speed. An obstacle detection mechanism was also added, aiming at a compression algorithm better suited to curved trajectories. However, the threshold of this method is still a fixed value and lacks adaptability. The literature29 proposed the PDP (Partition Douglas–Peucker) algorithm, which partitioned ship trajectory shapes and employed dynamic threshold settings to enhance the efficiency and accuracy of trajectory compression. This approach reduced the time required for compression and minimized data loss.
However, the dynamic threshold determination within the PDP algorithm is still anchored to the ship's length, which limits its adaptiveness. The literature30 proposed the ADP (Adaptive-threshold Douglas–Peucker) algorithm, which makes the threshold setting more flexible and accurate and improves computational efficiency by using matrix operations. However, when a cyclic ship trajectory was compressed by the ADP algorithm, more critical point information was lost. In addition, the higher complexity of the ADP algorithm increased the calculation time and storage cost.
In summary, the threshold values used to compress ship trajectories with the DP and SW algorithms are selected primarily by expert experience or by trial and error. The expert experience method is highly subjective and requires multiple iterations and experiments to ascertain the threshold values, resulting in low efficiency. The thresholds set by the trial-and-error method are mostly based on the ship domain or on static ship information such as length and width, and are therefore limited by the sailing waters and ship types. Furthermore, erroneous information within massive AIS datasets can also affect the threshold setting and thereby degrade the compression effect.
To overcome the deficiencies and limitations of the DP algorithm in threshold setting, which is often based on expert experience or trial and error, this paper proposes the ACTD-DP (Adaptive Core Threshold Difference-DP) algorithm. The ACTD-DP algorithm effectively reduces the reliance on experience or static ship information for threshold determination, demonstrating good applicability. Moreover, the ACTD-DP algorithm is suitable for compressing trajectory data from different types of water areas and ship types, exhibiting strong robustness.
The remainder of this paper is organized as follows. In “Methods”, the traditional DP algorithm, relevant definitions and evaluation metrics are introduced, followed by the logic and flow of the ACTD-DP algorithm. In “Experiment and analysis”, the experimental results of different algorithms are compared and analyzed in the selected research area. In “Discussion and conclusion”, the work is discussed and summarized, and directions for future research are outlined.
Methods
Traditional DP algorithm
The DP algorithm, initially introduced by Douglas and Peucker in 197331, is predicated on the concept of “straightening the curve” by approximating a curve with a series of points and reducing the number of points, which is commonly utilized for simplifying the motion trajectories of objects.
Assume a curve is constituted by a set of points {\(P_{1},P_{2},P_{3},P_{4},P_{5},P_{6}\)}, as shown in Fig. 1a, with a given threshold \(\varepsilon \). The compression process of the DP algorithm proceeds as follows:
(1) Connect the beginning and end of the curve, \(P_{1}\) - \(P_{6}\), as the baseline (dashed line in Fig. 1a), calculate the perpendicular distance from each remaining point to the baseline, and obtain the maximum distance value \( d_{max1} \) and the corresponding point \(P_{4}\).
(2) If \(d_{max1}< \varepsilon \), all the intermediate points are removed, and the curve point set is compressed to \(\left\{ P_{1}, P_{6} \right\} \); otherwise, save \(P_{4} \) as a key point, and divide the curve into two sub-curve point sets: \(\left\{ P_{1}, P_{2}, P_{3}, P_{4} \right\} \) and \(\left\{ P_{4}, P_{5}, P_{6} \right\} \), as shown in Fig. 1b. The baselines are \(P_{1}\) - \(P_{4}\) and \(P_{4}\) - \(P_{6}\), respectively.
(3) Repeat steps (1) and (2) for the two sub-curves, as shown in Fig. 1b, c. Finally, the compressed curve is \(\left\{ P_{1}, P_{2}, P_{4}, P_{6} \right\} \), as shown in Fig. 1d.
DP algorithm.
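Steps (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration of the classic recursive DP procedure on planar coordinates, not the implementation used in the experiments; the function names `point_line_distance` and `douglas_peucker` are chosen here for illustration.

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy)
    if norm == 0.0:  # degenerate baseline: fall back to point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / norm

def douglas_peucker(points, eps):
    """Return the key points of `points` for distance threshold `eps`."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Step (1): find the point farthest from the baseline first-last.
    d_max, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], first, last)
        if d > d_max:
            d_max, idx = d, i
    # Step (2): drop all intermediate points, or split at the farthest one.
    if idx == 0 or d_max < eps:
        return [first, last]
    # Step (3): recurse on both sub-curves; the split point is shared.
    left = douglas_peucker(points[:idx + 1], eps)
    right = douglas_peucker(points[idx:], eps)
    return left[:-1] + right
```

For example, `douglas_peucker([(0, 0), (1, 0), (2, 0), (3, 3), (4, 0), (5, 0)], 0.5)` keeps the spike at `(3, 3)` and its neighbours, whereas a threshold of 10 reduces the curve to its two end points.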
Relevant definition of trajectory compression
Projection
A ship trajectory is similar to the curve shown in Fig. 1; thus, with an appropriate threshold, the DP algorithm can detect and retain the critical points and discard the non-critical points within the ship's trajectory. When the DP algorithm compresses a curve, distances and thresholds are calculated in a Cartesian coordinate system. However, ship trajectory point data are typically given in a geographic coordinate system, where the calculation of spherical distances is more complex, particularly when determining the distance between a trajectory point and a line.
Consequently, prior to the compression of ship trajectories by DP algorithm, a transformation from the geographic coordinate system to the Mercator projection coordinate system is required to facilitate the calculation of distance values, which is called projection.
Let \(\left( \lambda ,\varphi \right) \) denote the longitude and latitude of a trajectory point in the geographic coordinate system, and let \(\left( x,y \right) \) denote its coordinates in the Mercator projection coordinate system. The conversion formula is as follows:
where R represents the radius of the parallel circle at the standard latitude; \(\delta \) represents the semi-major axis of the earth ellipsoid; \(\varphi _{0} \) represents the standard latitude of the Mercator projection; e represents the first eccentricity of the earth ellipsoid; q represents the isometric latitude.
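As an illustration of this projection step, the following sketch implements the standard ellipsoidal Mercator forward projection using the quantities named above. It is a hedged example rather than the paper's exact formula: the WGS-84 constants, the function name `mercator`, the default standard latitude of 0°, and the convention that x is the northing and y the easting are all assumptions made here.

```python
import math

# Assumed ellipsoid: WGS-84 semi-major axis (m) and first eccentricity.
WGS84_A = 6378137.0
WGS84_E = 0.0818191908426

def mercator(lon_deg, lat_deg, lat0_deg=0.0, a=WGS84_A, e=WGS84_E):
    """Project (longitude, latitude) in degrees to Mercator (x, y) in metres.

    x is taken as the northing (R * q) and y as the easting (R * lambda);
    this axis convention is an assumption of the sketch.
    """
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    phi0 = math.radians(lat0_deg)
    # R: radius of the parallel circle at the standard latitude phi0.
    R = a * math.cos(phi0) / math.sqrt(1.0 - (e * math.sin(phi0)) ** 2)
    # q: isometric latitude on the ellipsoid.
    s = e * math.sin(phi)
    q = math.log(math.tan(math.pi / 4 + phi / 2)) + (e / 2) * math.log((1 - s) / (1 + s))
    return R * q, R * lam
```

Once all points are projected, the perpendicular distances needed by the DP algorithm can be computed with ordinary planar geometry.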
Algorithm performance evaluation metrics
As shown in Fig. 2, let \(Tra_{org}=\left\{ P_{1},P_{2},\cdots ,P_{n} \right\} \) be the original ship trajectory, \(Tra_{cmp}=\left\{ P_{1},P_{n} \right\} \) be the compressed trajectory, and n be the number of trajectory points. Each trajectory point has two fundamental attributes, coordinate values and a timestamp, that is, \(P_{i}=\left\{ x_{i},y_{i},t_{i} \right\} \).
Trajectory compression diagram.
Definition 1
SED (synchronous Euclidean distance). SED is generally used to evaluate the effectiveness of compression. It denotes the Euclidean distance between the point \(P_{i} \) in the original trajectory \(Tra_{org}\) and its time-synchronized position \(P_{syn} \) on the compressed trajectory \(Tra_{cmp}\), where \(P_{syn} \) is located in proportion to time. \(P_{ped} \) is the foot of the perpendicular from \(P_{i} \) to the compressed trajectory, the length of \(P_{i}P_{ped} \) is the perpendicular Euclidean distance corresponding to \(P_{i} \), and the length of \(P_{i}P_{syn} \) is the SED corresponding to \(P_{i} \). The coordinates of \(P_{syn} \left( x_{syn},y_{syn} \right) \) and the corresponding SED are calculated as follows:
The mean SED of a single trajectory is denoted as MSED and the mean SED of all trajectories is expressed as AMSED. The formula is as follows:
where N represents the total number of trajectories.
MSED represents the location discrepancy between the original trajectory and the compressed trajectory. The smaller the value, the smaller the discrepancy and the trajectory distortion, and the higher the integrity of the trajectory shape.
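The definition of SED can be made concrete with a short sketch. It assumes that \(P_{syn}\) is obtained by linear interpolation in time between the two retained points that bracket \(P_{i}\), and that MSED averages the SED over all n original points (retained points contribute zero). The function names and the `kept` index list are illustrative, not taken from the paper.

```python
import math

def sed(p, p_start, p_end):
    """Synchronous Euclidean distance of p = (x, y, t) w.r.t. a retained segment."""
    x, y, t = p
    xs, ys, ts = p_start
    xe, ye, te = p_end
    # Time-proportional position on the segment (0 at p_start, 1 at p_end).
    ratio = 0.0 if te == ts else (t - ts) / (te - ts)
    x_syn = xs + ratio * (xe - xs)
    y_syn = ys + ratio * (ye - ys)
    return math.hypot(x - x_syn, y - y_syn)

def msed(original, kept):
    """Mean SED of one trajectory.

    original: list of (x, y, t) points.
    kept: sorted indices of the retained points, including 0 and n - 1.
    """
    total = 0.0
    for s, e in zip(kept, kept[1:]):
        for i in range(s + 1, e):
            total += sed(original[i], original[s], original[e])
    return total / len(original)
```

For a three-point trajectory `[(0, 0, 0), (5, 5, 5), (10, 0, 10)]` compressed to its end points, the synchronized position of the middle point is `(5, 0)`, so its SED is 5 and the MSED is 5/3.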
Definition 2
CR (compression ratio). CR is the ratio of the number of points discarded during the compression process to the total number of points in the original trajectory. When only the CR is considered, the higher the CR, the better the compression effect of the compression algorithm. The formula is as follows:
ACR (average compression ratio) denotes the average compression ratio of all trajectories. The formula is as follows:
The CR indicates the compression degree of the algorithm on the trajectory. The larger the value, the higher the compression degree of the trajectory, and the simpler the compression trajectory obtained.
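A minimal sketch of CR and ACR as defined above, expressed in percent. The function names and the `(original, compressed)` pair format are assumptions for illustration.

```python
def compression_ratio(n_original, n_compressed):
    """CR in percent: share of original points discarded by compression."""
    return (n_original - n_compressed) / n_original * 100.0

def average_compression_ratio(point_counts):
    """ACR in percent over (n_original, n_compressed) pairs, one per trajectory."""
    ratios = [compression_ratio(n, m) for n, m in point_counts]
    return sum(ratios) / len(ratios)
```

A trajectory reduced from 100 to 12 points has a CR of 88%.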
Definition 3
LLR (length loss ratio). LLR is the ratio of the length lost through compression to the length of the original trajectory. The formula is as follows:
where, \(Len_{org}\) represents the length of the original trajectory; \(Len_{cmp}\) represents the length of the compressed trajectory.
ALLR (average length loss ratio) denotes the mean LLR of all trajectories. The formula is as follows:
The LLR denotes the degree of loss on the track length. The larger the value, the more length information is lost during the compression process, and the greater the probability of deviation of some track features.
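The LLR can be sketched in the same way, assuming trajectory length is the sum of straight-line segment lengths in projected coordinates. The helper names are illustrative.

```python
import math

def path_length(points):
    """Total polyline length of a list of (x, y) points."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )

def length_loss_ratio(original, compressed):
    """LLR in percent: relative length lost by replacing original with compressed."""
    len_org = path_length(original)
    len_cmp = path_length(compressed)
    return (len_org - len_cmp) / len_org * 100.0
```

For the polyline `(0, 0) → (3, 4) → (6, 0)` of length 10, keeping only the end points leaves a length of 6, so the LLR is 40%.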
Definition 4
TIME. TIME is the total time taken by the algorithm to compress all trajectories in the current research area.
ACTD-DP algorithm
The purpose of trajectory compression is to reduce the number of trajectory points while maintaining the integrity of the trajectory's shape, thereby speeding up trajectory display and processing. The goal is to achieve an optimal balance between the quantity and the quality of the trajectory points. The compression quality of the DP algorithm is predominantly determined by the threshold value. The larger the threshold value, the higher the compression rate and the larger the trajectory distortion; the smaller the threshold value, the lower the compression rate and the smaller the trajectory distortion.
Traditional DP algorithms and the aforementioned studies in references22,23,24,25 set threshold values based on static information such as ship length, ship width, and fixed distance values, which diminishes the algorithm's adaptive capacity. The ACTD-DP algorithm is proposed, as depicted in Fig. 6, drawing on the conceptual frameworks of the PDP algorithm29 and the ADP algorithm30. The ACTD-DP algorithm employs an adaptive threshold difference approach, aiming to reduce reliance on static information. First, the original trajectory data are preprocessed using the course attribute from AIS data, and some trajectory points are discarded. Subsequently, a curve fitting method is used to establish a functional relationship between the threshold and the number of trajectory points. The characteristics of the function curve are then analyzed to determine the core threshold and the core threshold difference. Finally, a compression factor is introduced to determine the optimal core threshold difference, which serves as the key parameter controlling the accuracy and efficiency of the algorithm. In comparison with the PDP and ADP algorithms, the ACTD-DP algorithm is capable of achieving a higher compression rate while maintaining the integrity of the trajectory shape. Additionally, the ACTD-DP algorithm demonstrates adaptability across various maritime environments and ship types.
Preprocessing trajectory
Compared with the traditional DP algorithm, the ACTD-DP algorithm needs to solve for the optimal core threshold difference, which may increase the algorithm's execution time. To reduce the running time of the ACTD-DP algorithm, this paper first cleans the data by eliminating noise, drift points and other outliers. After that, following the approach in reference17, the data are preprocessed based on the course differences between original trajectory points to reduce the number of trajectory points. The trajectory preprocessing method is as follows:
- Let \(Tra_{org}\) be the original trajectory (Fig. 2) and \(\theta _{th} \) be the course change threshold. The end points \(\left\{ P_{1},P_{n} \right\} \) are first saved in the preprocessed trajectory \(Tra_{pre}\).
- Iteratively calculate the course difference \(\Delta \theta \) between adjacent trajectory points. If \(\Delta \theta > \theta _{th} \), the point is saved to \(Tra_{pre}\); otherwise, the point is discarded.
- To ensure data integrity, if the number of points discarded between two adjacent points in \(Tra_{pre}\) is higher than \(\Delta n\) (typically in the range \(\left[ 10,30\right] \)), trajectory points are selected from \(Tra_{org}\) at equal proportional intervals and saved to \(Tra_{pre}\).
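The preprocessing steps above can be sketched as follows. The source does not spell out how many points are re-selected when a gap exceeds \(\Delta n\), so this example assumes that each over-long gap is split into the fewest equal parts such that no part discards more than \(\Delta n\) points. The function names and the use of index lists are also illustrative.

```python
def course_difference(c1, c2):
    """Smallest absolute difference between two courses in degrees (0-360)."""
    d = abs(c1 - c2) % 360.0
    return min(d, 360.0 - d)

def preprocess(courses, theta_th=10.0, delta_n=10):
    """Return indices of the points kept by course-based preprocessing.

    courses: course over ground of each original point, in degrees.
    theta_th: course change threshold in degrees.
    delta_n: maximum number of consecutive discarded points allowed.
    """
    n = len(courses)
    if n <= 2:
        return list(range(n))
    # Keep both end points and every point whose course change exceeds theta_th.
    keep = [0]
    for i in range(1, n - 1):
        if course_difference(courses[i - 1], courses[i]) > theta_th:
            keep.append(i)
    keep.append(n - 1)
    # Refill gaps that discarded more than delta_n points (assumed rule:
    # split the gap into equal parts of at most delta_n + 1 steps each).
    result = [keep[0]]
    for a, b in zip(keep, keep[1:]):
        if b - a - 1 > delta_n:
            parts = -(-(b - a) // (delta_n + 1))  # ceiling division
            result.extend(a + j * (b - a) // parts for j in range(1, parts))
        result.append(b)
    return result
```

With a constant course over 25 points and \(\Delta n = 10\), the sketch returns indices `[0, 8, 16, 24]`, so a long straight leg is still represented by a few evenly spaced points.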
The DR (discard ratio) is the ratio of the number of points discarded during trajectory preprocessing to the total number of points in the original trajectory. Based on the experimental AIS data (as described in “Description of experimental data”), the relationship between the DR and AMSED values and the threshold \(\theta _{th} \) is depicted in Fig. 3. For \(\theta _{th} < \hbox {10}^{\circ } \), both DR and AMSED change markedly. At \(\theta _{th} = \hbox {1}^{\circ } \), DR is 42.89% and AMSED is 8.5372 m. At \(\theta _{th} = \hbox {10}^{\circ } \), DR increases to 72.30% and AMSED to 17.30 m. For \(\theta _{th} > \hbox {10}^{\circ } \), the variation in DR is relatively minor. Compared with the data in Table 5, the AMSED values presented in Fig. 3 are consistently lower, indicating that the discarded points cause minimal distortion and have only a minor impact on the trajectory's shape features. Consequently, the threshold \(\theta _{th} = \hbox {10}^{\circ } \) and the number of points \(\Delta n=10\) are selected for further analysis.
DR and AMSED values with respect to threshold \(\theta _{th} \).
Fitting thresholds-points function
The threshold of DP-based trajectory compression algorithms is directly correlated with the number of points in the compressed trajectory, which in turn affects the quality of the compressed trajectory. To quantify the relationship between the threshold values and the number of trajectory points for the ACTD-DP algorithm, this paper uses the experimental AIS data (as detailed in “Description of experimental data”) to count the total number of compressed trajectory points across a threshold range of [0.01, 10] times the ship's length, as depicted in Fig. 4a. The statistics show a nonlinear negative correlation between the threshold value and the number of trajectory points. When the threshold is small, the number of trajectory points decreases rapidly. As the threshold increases, the rate of reduction slows, and in certain regions slight oscillations or a tendency to rebound may be observed.
According to the curve (Fig. 4a) analysis, the functional relationship between the thresholds and the number of trajectory points may conform to the characteristics of power function and exponential function. Therefore, the fitting function equations can be assumed to be:
where \(\omega _{i}\left( i=1,2\ldots 6 \right) \) represents the equation parameters.
Then, the equations are solved according to the statistical data, and the resulting parameter values are shown in Table 1. The fitted function curves are shown in Fig. 4b.
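To illustrate how such a fit can be obtained, the sketch below fits a power function to threshold and point-count pairs. Because the fitted equations themselves are not reproduced in this text, the two-parameter form y = a·x^b is an assumption made here, and the fit uses ordinary least squares in log-log space rather than whatever solver the authors used. The name `fit_power_law` is hypothetical.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**b by linear least squares on (ln x, ln y).

    xs, ys: positive thresholds and the matching compressed point counts.
    Returns the pair (a, b).
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                      # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)  # intercept in log-log space = ln a
    return a, b
```

A fit with an additive offset term, or the exponential alternative, would normally require a nonlinear least-squares routine instead of this closed-form solution.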
Thresholds-points function fitting. (a) Thresholds-points statistics and (b) the fitting function curves.
Finally, the coefficient of determination is selected as the goodness-of-fit measure for the two functions. The coefficient of determination, denoted \(R^2\), describes how much of the variation in the dependent variable is explained by the fitted function, and it is an essential metric for assessing the fit of a regression equation. The formula for its calculation is as follows:
where, \(S_{r}\) represents residual sum of squares; \(S_{t}\) represents total sum of squares; \(y_{i}\) represents the actual value; \(\widehat{y_{i} }\) represents the fitting curve value; \(\overline{y}\) represents the average value.
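Using the quantities just defined, the coefficient of determination can be computed as in the sketch below, which assumes the usual form \(R^2 = 1 - S_{r}/S_{t}\). The function name is illustrative.

```python
def r_squared(actual, fitted):
    """Coefficient of determination, 1 - S_r / S_t."""
    mean_y = sum(actual) / len(actual)
    s_r = sum((y - f) ** 2 for y, f in zip(actual, fitted))  # residual sum of squares
    s_t = sum((y - mean_y) ** 2 for y in actual)             # total sum of squares
    return 1.0 - s_r / s_t
```

For actual values `[1, 2, 3]` and fitted values `[1, 2, 4]`, the residual sum of squares is 1 and the total sum of squares is 2, giving an \(R^2\) of 0.5.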
The coefficients of determination for Eq. (8), \(R^2=0.9872\), and for Eq. (9), \(R^2=0.9589\), indicate that the values derived from Eq. (8) are more closely aligned with the actual data, signifying a higher degree of fit and greater applicability. Consequently, the power function is selected as the thresholds-points function to represent the relationship between the threshold and the number of trajectory points.
Optimal core threshold difference calculating
After the thresholds-points function has been established, the characteristics of the function curve need to be analyzed further in order to evaluate compression effectiveness during trajectory compression. This requires establishing the link between the curve and the ACTD-DP algorithm and explicitly defining the role that the thresholds, or the differences between them, play within the algorithm.
When the ACTD-DP algorithm compresses a trajectory, the maximum distance from the trajectory points to the baseline (similar to that depicted in Fig. 1) is calculated within each compressed trajectory segment. This maximum distance is taken as the core threshold for that trajectory segment, denoted as \(\varepsilon _{core}\). The difference between the core thresholds obtained from two consecutive compressions is defined as the core threshold difference. The formula is as follows:
where \(\Delta \varepsilon _{k}\) represents the core threshold difference at the k-th compression; \(\varepsilon _{core}^{k}\) represents the core threshold obtained during the k-th compression; k represents the compression order.
During the trajectory compression process, the overall trend of the core threshold difference is a gradual decrease (as shown in Fig. 5a). When \(\Delta \varepsilon _{k} < \Delta \varepsilon _{o} \le \Delta \varepsilon _{k-i} \left( i=1,2\ldots ,k-1 \right) \) holds, the core threshold difference has become stable, the trajectory shape changes little, and the trajectory compression ratio also tends to be stable. At this point, \(\Delta \varepsilon _{o} \) is identified as the optimal core threshold difference, which achieves the best balance between the quantity and the quality of the trajectory points and signifies the completion of trajectory compression. Therefore, balancing the quantity and the quality of the trajectory points amounts to searching for the optimal core threshold difference.
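The stopping condition can be illustrated with a short sketch. It assumes that the core threshold difference is the previous core threshold minus the current one (the defining equation is not reproduced in this text) and that the core thresholds of successive compressions are already available as a list. Both function names are hypothetical.

```python
def core_threshold_differences(core_thresholds):
    """Differences between consecutive core thresholds (previous minus current)."""
    return [prev - cur for prev, cur in zip(core_thresholds, core_thresholds[1:])]

def first_stable_step(core_thresholds, delta_opt):
    """Return the first compression order k whose difference falls below delta_opt.

    core_thresholds[0] is treated as order 0, so the first difference has k = 1.
    Returns None if the difference never drops below delta_opt.
    """
    diffs = core_threshold_differences(core_thresholds)
    for k, diff in enumerate(diffs, start=1):
        if diff < delta_opt:
            return k
    return None
```

For core thresholds `[100, 60, 40, 30, 28, 27]` and an optimal difference of 3.097, the differences are 40, 20, 10, 2 and 1, so the condition is first met at k = 4.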
Combined with the curve (Fig. 4a) analysis of Equation (8), the core threshold difference corresponds to the derivative of the fitted curve (Fig. 5a). Consequently, the optimal core threshold difference can be calculated based on the angle difference between two points on the fitting curve (as illustrated in Fig. 5b), where the core threshold difference at the position of the maximum angle difference of the fitting curve is identified as the optimal core threshold difference. The formulas for calculating the derivative of the fitting curve and the angle difference are as follows:
The analysis (Fig. 5b) shows that when \(x=0.4\), the angle difference reaches the maximum, and the corresponding optimal core threshold difference is \(\Delta \varepsilon _{o} =3.097\).
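The search for the maximum angle difference can be sketched as follows. The example assumes a power-law fit y = a·x^b, takes the angle at each sample to be the inclination of the tangent, arctan(f'(x)), and defines the angle difference between adjacent samples on a uniform grid. These choices and the function names are illustrative assumptions, since the corresponding equations are not reproduced in this text.

```python
import math

def tangent_angle(x, a, b):
    """Inclination (degrees) of the tangent to y = a * x**b at x."""
    slope = a * b * x ** (b - 1)  # derivative of the power function
    return math.degrees(math.atan(slope))

def max_angle_difference(xs, a, b):
    """Return (x, difference) where adjacent tangent angles differ the most.

    xs: increasing sample positions (thresholds); the returned x is the
    left end of the interval with the largest absolute angle change.
    """
    angles = [tangent_angle(x, a, b) for x in xs]
    best = max(range(len(xs) - 1), key=lambda i: abs(angles[i + 1] - angles[i]))
    return xs[best], abs(angles[best + 1] - angles[best])
```

For the toy curve y = 1/x sampled every 0.01, the largest angle change occurs near x ≈ 0.76. The position reported in the text depends on the fitted parameters in Table 1.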
The fitted curve analysis diagram. (a) The derivative of the fitted curve and (b) the angle difference of the fitted curve.
Compression factor
Different purposes of trajectory research require different compression effects. When studying the traffic flow state of water area, the higher compression ratio is preferred; whereas when examining the ship navigation state, the higher quality of compression is more desirable. To meet the diverse research objectives, this paper introduces a compression factor, denoted as \(\rho \), based on the optimal core threshold difference, with a default value of 0.5.
In summary, after the optimal core threshold difference is obtained, the trajectory compression can be carried out. The ACTD-DP Algorithm flow is shown in Fig. 6, and the algorithm code is shown in Algorithm 1.
ACTD-DP (Part 1).
Algorithm flow.
ACTD-DP (Part 2).
Experiment and analysis
Description of experimental data
To verify the validity of the proposed algorithm, AIS data from the Zhoushan waters in China are selected; the experimental water area is given in Table 2, and the time range is from 00:00 to 24:00 on May 1st, 2021. According to a statistical analysis of ship MMSI numbers, the experimental waters contained 515 ships and 404,646 AIS records. The statistics of ship types are shown in Table 3, and the statistics of ship dimensions are shown in Table 4.
The experimental water area and the original ship trajectories are depicted in Fig. 7a. The raw data contain a significant amount of noise and redundant information. Preprocessing is conducted according to the method described in “Preprocessing trajectory”, and the results are presented in Fig. 7b. The experiment was implemented on a computer (64-bit Windows, Intel(R) Xeon(R) Gold 5218R CPU @ 2.10 GHz, 32.0 GB of RAM) using MATLAB (version R2024a32).
The experimental water area and AIS data (The color curves are ship trajectories. The yellow polygon area is the land part of the experimental area. The white background indicates the experimental water area. The figure is drawn using MATLAB, which version is R2024a32.). (a) The original trajectories and (b) the preprocessing trajectories.
Compression results of all trajectories
To compare the compression effects of different algorithms, four algorithms are selected for comparison: the SW algorithm from reference19 (with a distance threshold of 0.8 times the ship width), the DP algorithm from reference22 (with a distance threshold of 0.8 times the ship length), the PDP (Partition Douglas–Peucker) algorithm from reference29 (with a distance threshold of 0.5 times the ship length and an angular threshold of \(\hbox {10}^{\circ }\)), and the ADP algorithm from reference30 (with an optimal threshold change rate of 1.36). All trajectories are compressed by each algorithm, and the resulting evaluation metric values are shown in Table 5 and Fig. 8.
Evaluation metric values comparison chart.
Table 5 and Fig. 8 show that the evaluation metric values of the five algorithms differ. The TIME values of the SW, DP, and PDP algorithms are clearly smaller than those of the ADP and ACTD-DP algorithms, indicating that the first three have lower complexity and higher computational efficiency, while the latter two, because they need to divide the trajectory, have higher complexity and lower computational efficiency. The ADP algorithm has the lowest computational efficiency.
The SW algorithm exhibits the lowest values in terms of ACR and ALLR, yet it presents a notably high AMSED. This observation indicates that the SW algorithm, during the trajectory compression process, retains a relatively large number of non-critical points, leading to minimal discrepancies in length between the compressed and original trajectories. However, the location discrepancy is significantly pronounced, thereby yielding the least effective trajectory compression outcome among the evaluated methods.
The DP algorithm is characterized by the highest values of ACR, ALLR, and AMSED. These metrics suggest that the DP algorithm achieves a significant reduction in trajectory data, indicative of its pronounced compression efficacy. However, the substantial elimination of critical points results in considerable discrepancies in both length and location between the compressed and original trajectories. Consequently, the DP algorithm's trajectory compression performance is deemed to be of moderate quality.
The PDP algorithm exhibits a high ACR and the lowest ALLR among the evaluated methods, while its AMSED is relatively elevated yet notably lower than that of the DP algorithm. These observations indicate that the PDP algorithm exhibits a marked optimization over the traditional DP algorithm, yielding a more favorable trajectory compression outcome.
The ADP algorithm is distinguished by its higher ACR and ALLR, both of which surpass the corresponding values of the PDP algorithm. Additionally, the ADP algorithm exhibits a lower AMSED, signifying a reduced location discrepancy between the compressed and original trajectories. Collectively, these metric values underscore the ADP algorithm’s notable effectiveness in trajectory compression.
The ACTD-DP algorithm is notable for its minimal AMSED, which is significantly lower than the values observed in other algorithms. This algorithm also presents a higher ACR and a lower ALLR. These metric values indicate that the ACTD-DP algorithm excels in retaining critical points during the trajectory compression process, leading to a relatively higher length discrepancy but a markedly reduced location discrepancy when compared to the original trajectories. Consequently, the ACTD-DP algorithm is recognized for its superior compression performance.
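As a reading aid for the metric comparison above, the sketch below computes a compression ratio, a length loss ratio, and a mean synchronous Euclidean distance for a single trajectory. It assumes the commonly used definitions (CR as the share of points removed, LLR as the relative loss of path length, SED as the distance between each original point and its time-synchronized position on the compressed trajectory); the exact formulas of this paper are given in its earlier sections and may differ in detail. All names and the sample data are hypothetical.

```python
import math

def compression_ratio(n_original, n_compressed):
    """Percentage of points removed by compression."""
    return (1.0 - n_compressed / n_original) * 100.0

def path_length(points):
    """Sum of planar segment lengths; each point is (x, y, t)."""
    return sum(math.dist(a[:2], b[:2]) for a, b in zip(points, points[1:]))

def length_loss_ratio(original, compressed):
    """Percentage of path length lost after compression."""
    l0 = path_length(original)
    return (l0 - path_length(compressed)) / l0 * 100.0

def mean_sed(original, kept_idx):
    """Mean synchronous Euclidean distance over all original points.

    kept_idx must be sorted and contain the first and last index.
    Retained points contribute a distance of zero.
    """
    total = 0.0
    for a, b in zip(kept_idx, kept_idx[1:]):
        xa, ya, ta = original[a]
        xb, yb, tb = original[b]
        for i in range(a + 1, b):
            x, y, t = original[i]
            r = (t - ta) / (tb - ta) if tb != ta else 0.0
            # Position on the compressed segment at the same timestamp.
            sx, sy = xa + r * (xb - xa), ya + r * (yb - ya)
            total += math.hypot(x - sx, y - sy)
    return total / len(original)

# Hypothetical track: (x in m, y in m, t in s).
track = [(0, 0, 0), (100, 0, 10), (200, 50, 20), (300, 0, 30), (400, 0, 40)]
kept_idx = [0, 2, 4]
compressed = [track[i] for i in kept_idx]

cr = compression_ratio(len(track), len(compressed))
llr = length_loss_ratio(track, compressed)
msed = mean_sed(track, kept_idx)
print(round(cr, 2), round(llr, 2), round(msed, 2))
```

Averaging these per-trajectory values over all trajectories would give quantities analogous to ACR, ALLR and AMSED, under the same assumed definitions.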
Compression results of different types of trajectories
To further demonstrate the robustness and applicability of the ACTD-DP algorithm, four ship trajectories were randomly selected for compression. The information for the chosen ships and their trajectories is presented in Table 6.
Each algorithm is applied to the above trajectories, and the resulting evaluation metric values are shown in Table 7. The TIME values for the different ship trajectories and algorithms align with the trends observed in Section 3.2: the TIME values for the ADP and ACTD-DP algorithms are significantly higher than those for the other algorithms.
Ship1’s course changes are small, its trajectory shape is simple, and the voyage is almost straight. The MSED value of the SW algorithm is the largest (62.7271 m, approximately 1.1 times the ship length), and its CR is suboptimal. The SW algorithm is weaker at detecting and retaining critical points, particularly where the trajectory’s curvature is low (region A in Fig. 9, where only one point is retained), leading to significant distortion of the trajectory. The DP algorithm exhibits the highest CR and LLR values, with a suboptimal MSED value. It performs worst at detecting and retaining critical points (region A in Fig. 9, where no critical points are retained), giving the least effective compression. The PDP algorithm has the lowest LLR, and its MSED and CR values are both relatively small. It is better at detecting and retaining critical points, and it handles the junctions between straight lines and curves more effectively than the DP algorithm (region A in Fig. 9, where three points are retained). The ADP algorithm has the lowest CR, and its LLR and MSED values are both relatively small. The ADP algorithm has the strongest tendency to retain critical points, resulting in an excessive number of trajectory points at the junctions between straight lines and curves (region A in Fig. 9, where ten points are retained). The ACTD-DP algorithm has the lowest MSED value (42.907 m, approximately 0.8 times the ship length), and its CR and LLR values are also favorable. It shows the strongest capability in detecting and retaining critical points and handles the transitions between straight lines and curves effectively (region A in Fig. 9, where three points are also retained), resulting in minimal trajectory distortion and the best compression effect.
Ship2 is a fishing vessel with the fewest track points. However, its course changes are large and frequent, with large-angle turns and U-turns, and its navigation trajectory is the most complicated. Compared with the other three trajectories, the five algorithms yield lower CR and LLR values for this trajectory (the best CR being 77.58% from the DP algorithm and the best LLR being 4.44% from the ADP algorithm), and the MSED values are poor. The SW algorithm exhibits a lower MSED value (66.0019 m, approximately 1.8 times the ship length), while the DP algorithm exhibits the highest MSED value (155.6476 m, about 4.3 times the ship length). The CR and MSED of the ACTD-DP algorithm are suboptimal. Consequently, the ACTD-DP algorithm maintains a good compression effect, with a high CR and a low MSED. However, among the four trajectories compressed by the ACTD-DP algorithm, the evaluation metrics show that the compression effect for this trajectory is the worst.
Ship3’s trajectory is relatively complex, with large course changes and sharp turns. However, because straight-line segments make up a large proportion of the trajectory, the compression effect of each algorithm is similar to that for Ship1. The CR value of the SW algorithm is the highest, while its LLR and MSED values are the second highest among the evaluated methods. However, its compression effect is poor where the curvature of the trajectory changes frequently (region B in Fig. 9). The DP algorithm demonstrates the highest values for CR, LLR and MSED (199.1665 m, approximately 1.3 times the ship length). This indicates a weaker capability in detecting critical points, resulting in the fewest critical points retained (region B in Fig. 9). Consequently, the DP algorithm is associated with the poorest compression performance among the evaluated methods. The PDP algorithm exhibits the second-lowest CR, indicative of a relatively higher retention of trajectory points. Its LLR and MSED values are moderate, reflecting a balanced performance in terms of trajectory fidelity and compression efficiency. The PDP algorithm effectively captures critical points where the course changes significantly, yet the fixed threshold discards many critical points (region B in Fig. 9). Despite this, the overall compression performance of the PDP algorithm is good. The ADP algorithm exhibits the lowest values for CR, LLR and MSED (12.3542 m, approximately 0.08 times the ship length). Compared to the PDP algorithm, the ADP algorithm demonstrates a heightened ability to detect critical points within the curved segments of the trajectory. However, the ADP algorithm retains an excessive number of critical points (region B in Fig. 9). This over-retention, while enhancing detection, may lead to a less efficient compression outcome. The ACTD-DP algorithm is distinguished by a high CR and by the second-lowest LLR and MSED (20.507 m, approximately 0.14 times the ship length). Comparatively, the ACTD-DP algorithm outperforms the ADP algorithm in the detection of critical points. During the compression of linear trajectory segments, the algorithm retains a greater number of critical points. In the compression of curved trajectory segments, the number of retained critical points is moderate (region B in Fig. 9). Therefore, the ACTD-DP algorithm is recognized for its superior compression efficacy.
Ship4’s trajectory is relatively simple, with distinct boundaries between straight and curved segments. However, there is a high number of anchoring points, constituting 72% of the total number of points. As a result, the compression of the anchoring paths is significant, leading to higher LLR values. The SW algorithm exhibits the lowest values for both CR (only 22.05%) and LLR, with a relatively large MSED. Its ability to detect critical points differs considerably between line segments and curve segments. The SW algorithm also fails to effectively process critical points in anchoring trajectories (region C in Fig. 9), with a CR of less than 1% for such trajectories, resulting in the poorest compression performance. The DP algorithm demonstrates the highest values for CR, LLR and MSED (373.5619 m, approximately 1.7 times the ship length). This indicates a significant reduction in trajectory detail, which adversely affects the detection of critical points, particularly in anchoring trajectories (region C in Fig. 9). The DP algorithm compresses the entire anchoring trajectory into a single critical point, which fails to meet the research requirements for analyzing anchorage stay patterns and utilization rates. The PDP algorithm exhibits a lower CR and a suboptimal MSED, suggesting a more conservative compression strategy. While the PDP algorithm performs well in compressing linear and typical curved trajectories, its heightened sensitivity to transitional segments leads to less effective compression of anchoring trajectories (region C in Fig. 9). This overemphasis on detecting connections between trajectory segments may compromise the fidelity of the compressed trajectory in representing the original anchoring behavior. The ADP algorithm exhibits moderate values for CR, LLR and MSED. Compared to the PDP algorithm, the ADP algorithm demonstrates enhanced capabilities in detecting and retaining critical points, particularly at the junctions between linear and curved segments of trajectories. However, like the DP algorithm, the ADP algorithm struggles to effectively process anchoring trajectories (region C in Fig. 9), which limits its utility in capturing the details of such trajectories. The ACTD-DP algorithm is distinguished by the second-highest values for CR and LLR and the lowest MSED (24.9138 m, approximately 0.11 times the ship length). This algorithm excels in the detection and retention of critical points, particularly with a uniform distribution of critical points in linear trajectory segments. In comparison to the ADP algorithm, the ACTD-DP algorithm also handles anchoring trajectories effectively, capturing the typical positions and trajectories during the ship’s anchoring process (region C in Fig. 9). Consequently, upon comprehensive analysis of the compression effects, the ACTD-DP algorithm is concluded to provide the optimal compression performance.
Trajectory compression comparison.
Discussion and conclusion
Building upon the traditional DP algorithm31 and drawing inspiration from the methodology of the PDP algorithm29 and the ADP algorithm30, the ACTD-DP algorithm is proposed and validated experimentally. The thresholds of the other comparison algorithms are based on static information (ship length, ship width, fixed distance value, etc.), so their trajectory compression effects are strongly affected by the external environment and their adaptive ability is poor. In contrast, the ACTD-DP algorithm employs the optimal threshold difference method, reducing reliance on fixed thresholds and enhancing the robustness and applicability of the algorithm. From the overall analysis of compression effects (Table 5 and Fig. 8), compared to the other four algorithms, the ACTD-DP algorithm demonstrates the strongest capability in detecting and retaining key points. It maintains the smallest AMSED value while preserving a high ACR value, resulting in the best compression performance. Analyzing the evaluation metrics for the four trajectories, the ACTD-DP algorithm exhibits the best compression performance for all trajectories except Ship2’s trajectory, demonstrating strong adaptability to different trajectories.
However, the ACTD-DP algorithm also has a notable drawback. It requires curve fitting and the calculation of core thresholds and the optimal threshold difference. Consequently, its computational complexity is relatively high, leading to increased execution time. In addition, the ACTD-DP algorithm yields a lower CR for trajectories with a limited number of points and abrupt changes in course, such as Ship2. The compression performance for these types of trajectories could be further enhanced. These observations also provide directions for future research.
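The source of the extra execution time can be illustrated with a small sketch of the threshold sweep that precedes curve fitting: the number of retained points is recorded for each candidate threshold, which costs one DP pass per threshold. This is only an assumed outline of that quantification step; the candidate thresholds, the sample track, and the function names are hypothetical, and the curve fitting and core threshold calculation themselves are not reproduced here.

```python
import math

def dp_keep_count(points, threshold):
    """Number of points retained by classical DP for one threshold."""
    n = len(points)
    if n < 3:
        return n
    kept = 2  # the two endpoints are always retained
    stack = [(0, n - 1)]
    while stack:
        first, last = stack.pop()
        (ax, ay), (bx, by) = points[first], points[last]
        dx, dy = bx - ax, by - ay
        norm = math.hypot(dx, dy)
        best_d, best_i = 0.0, None
        for i in range(first + 1, last):
            px, py = points[i]
            if norm == 0.0:
                d = math.hypot(px - ax, py - ay)
            else:
                d = abs(dx * (py - ay) - dy * (px - ax)) / norm
            if d > best_d:
                best_d, best_i = d, i
        if best_i is not None and best_d > threshold:
            kept += 1
            stack.append((first, best_i))
            stack.append((best_i, last))
    return kept

def sweep_thresholds(points, thresholds):
    """One full DP pass per candidate threshold: (threshold, points kept)."""
    return [(th, dp_keep_count(points, th)) for th in thresholds]

track = [(0, 0), (500, 10), (1000, 0), (1500, 300), (2000, 0)]
curve = sweep_thresholds(track, [5, 50, 150, 250, 400])
print(curve)  # [(5, 5), (50, 4), (150, 4), (250, 3), (400, 2)]
```

The resulting pairs form the threshold versus point-count relationship to which a function would then be fitted. Sweeping k thresholds multiplies the cost of a single DP pass by roughly k, before any fitting is done, which is consistent with the higher TIME values reported for the adaptive algorithms.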
Data availability
The data that support the findings of this study are available from Shanghai Maritime Safety Administration, but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. The datasets generated and analysed during the current study are not publicly available due to data sensitivity but are available from the corresponding author on reasonable request.
References
Huang, C. et al. A simulation model for marine traffic environment risk assessment in the traffic separation scheme. In 2021 6th International Conference on Transportation Information and Safety (ICTIS). 213–221 https://doi.org/10.1109/ICTIS54573.2021.9798623 (IEEE, 2021).
Kim, E. et al. Sensitive resource and traffic density risk analysis of marine spill accidents using automated identification system big data. J. Mar. Sci. Appl. 19, 173–181. https://doi.org/10.1007/s11804-020-00138-2 (2020).
An, K. E-navigation services for non-SOLAS ships. Int. J. e-Navigat. Maritime Econ. 4, 13–22. https://doi.org/10.1016/j.enavi.2016.06.002 (2016).
Barco, S. G., Lockhart, G. G. & Swingle, W. M. Using RADAR & AIS to investigate ship behavior in the Chesapeake Bay ocean approach off of Virginia, USA. In 2012 Oceans. 1–8 https://doi.org/10.1109/OCEANS.2012.6404872 (IEEE, 2012).
Lei, P.-R., Yu, P.-R. & Peng, W.-C. A Framework for maritime anti-collision pattern discovery from AIS network. In 2019 20th Asia-Pacific Network Operations and Management Symposium (APNOMS). 1–4 https://doi.org/10.23919/APNOMS.2019.8892899 (IEEE, 2019).
Han, P. & Yang, X. Big data-driven automatic generation of ship route planning in complex maritime environments. Acta Oceanol. Sin. 39, 113–120. https://doi.org/10.1007/s13131-020-1638-5 (2020).
Hao, Y., Zheng, P. & Han, Z. Automatic generation of water route based on AIS big data and ECDIS. Arab. J. Geosci. 14, 1–8 https://doi.org/10.1007/s12517-021-06930-w (Springer, 2021).
Blindheim, S., Johansen, T. A. & Utne, I. B. Risk-based supervisory control for autonomous ship navigation. J. Mar. Sci. Technol. 28, 624–648. https://doi.org/10.1007/s00773-023-00945-6 (2023).
Guo, Y. & Ding, Z. Application of big data in analyzing the impact of explosive cyclone on ship navigation safety. In 2021 2nd International Conference on Smart Electronics and Communication (ICOSEC). 910–913 https://doi.org/10.1109/ICOSEC51865.2021.9591958 (IEEE, 2021).
Liu, Z., Li, Y., Zhang, Z., Yu, W. & Du, Y. Spatial modeling and analysis based on spatial information of the ship encounters for intelligent navigation safety. Reliabil. Eng. Syst. Saf. 238, 109489. https://doi.org/10.1016/j.ress.2023.109489 (2023).
Makris, A. et al. Evaluating the effect of compressing algorithms for trajectory similarity and classification problems. GeoInformatica 25, 679–711. https://doi.org/10.1007/s10707-021-00434-1 (2021).
Chen, J., Zhang, J., Chen, H., Zhao, Y. & Wang, H. A TDV attention-based BiGRU network for AIS-based vessel trajectory prediction. iScience 26, 106383 https://doi.org/10.1016/j.isci.2023.106383 (2023).
Liu, H., Liu, Y. & Zong, Z. Research on ship abnormal behavior detection method based on graph neural network. In 2022 IEEE International Conference on Mechatronics and Automation (ICMA). 834–838 https://doi.org/10.1109/ICMA54519.2022.9856198 (IEEE, 2022).
Murray, B. & Perera, L. P. Ship behavior prediction via trajectory extraction-based clustering for maritime situation awareness. J. Ocean Eng. Sci. 7, 1–13. https://doi.org/10.1016/j.joes.2021.03.001 (2022).
Jeong, S. & Kim, T.-W. Generating a path-search graph based on ship-trajectory data: Route search via dynamic programming for autonomous ships. Ocean Eng. 283, 114503. https://doi.org/10.1016/j.oceaneng.2023.114503 (2023).
Meratnia, N. & De By, R. A. Spatiotemporal compression techniques for moving point objects. In (Goos, G. et al. eds.) Advances in Database Technology-EDBT 2004. Lecture Notes in Computer Science. Vol. 2992. 765–782 https://doi.org/10.1007/978-3-540-24741-8_44 (Springer, 2004).
Jie, G., Xin, S., Xiaowei, S. & Daofang, C. Online compression algorithm of fishing ship trajectories based on improved sliding window. J. Shanghai Maritime Univ. 44, 17–24 https://doi.org/10.13340/j.jsmu.202208070214 (2023).
Zhu, F. & Ma, Z. Ship trajectory online compression algorithm considering handling patterns. IEEE Access 9, 70182–70191. https://doi.org/10.1109/ACCESS.2021.3078642 (2021).
Gao, M. & Shi, G.-Y. Ship spatiotemporal key feature point online extraction based on AIS multi-sensor data using an improved sliding window algorithm. Sensors 19, 2706. https://doi.org/10.3390/s19122706 (2019).
Qi, Z., Yi, C., Li, X. & Wen, G. Improved sliding window trajectory compression algorithm considering motion characteristics. J. Geomat. Sci. Technol. 37, 622–627 (2020).
Bai, X., Xie, Z., Xu, X. & Xiao, Y. An adaptive threshold fast DBSCAN algorithm with preserved trajectory feature points for vessel trajectory clustering. Ocean Eng. 280, 114930. https://doi.org/10.1016/j.oceaneng.2023.114930 (2023).
Zhang, S.-K., Shi, G.-Y., Liu, Z.-J., Zhao, Z.-W. & Wu, Z.-L. Data-driven based automatic maritime routing from massive AIS trajectories in the face of disparity. Ocean Eng. 155, 240–250. https://doi.org/10.1016/j.oceaneng.2018.02.060 (2018).
Wei, Z., Xie, X. & Zhang, X. AIS trajectory simplification algorithm considering ship behaviours. Ocean Eng. 216, 108086. https://doi.org/10.1016/j.oceaneng.2020.108086 (2020).
Huang, C., Qi, X., Zheng, J., Zhu, R. & Shen, J. A maritime traffic route extraction method based on density-based spatial clustering of applications with noise for multi-dimensional data. Ocean Eng. 268, 113036. https://doi.org/10.1016/j.oceaneng.2022.113036 (2023).
Shukai, Z., Zhengjiang, L., Xianku, Z., Guoyou, S. & Yao, C. A method for AIS track data compression based on Douglas–Peucker algorithm. J. Harbin Eng. Univ. 36, 595–599 (2015).
Li, H. et al. Unsupervised hierarchical methodology of maritime traffic pattern extraction for knowledge discovery. Transport. Res. Part C Emerg. Technol. 143, 103856. https://doi.org/10.1016/j.trc.2022.103856 (2022).
Cui, C. & Dong, Z. Ship space-time AIS trajectory data compression method. In 2022 7th International Conference on Big Data Analytics (ICBDA). 40–44 https://doi.org/10.1109/ICBDA55095.2022.9760355 (IEEE, 2022).
Zhou, Z., Zhang, Y., Yuan, X. & Wang, H. Compressing AIS trajectory data based on the multi-objective peak Douglas–Peucker algorithm. IEEE Access 11, 6802–6821. https://doi.org/10.1109/ACCESS.2023.3234121 (2023).
Zhao, L. & Shi, G. A method for simplifying ship trajectory based on improved Douglas–Peucker algorithm. Ocean Eng. 166, 37–46. https://doi.org/10.1016/j.oceaneng.2018.08.005 (2018).
Tang, C. et al. A method for compressing AIS trajectory data based on the adaptive-threshold Douglas–Peucker algorithm. Ocean Eng. 232, 109041. https://doi.org/10.1016/j.oceaneng.2021.109041 (2021).
Douglas, D. H. & Peucker, T. K. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. In Classics in Cartography (Dodge, M. ed.). 1 Ed. 15–28 https://doi.org/10.1002/9780470669488.ch2 (Wiley, 2011).
MathWorks, I. Matlab. https://www.mathworks.com/products/matlab/ (2024).
Acknowledgements
This study is supported by the research project “Science and Technology Commission of Shanghai Municipality” (Grant Nos. 22010502000 and 23010501900).
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhang, T., Wang, Z. & Wang, P. A method for compressing AIS trajectory based on the adaptive core threshold difference Douglas–Peucker algorithm. Sci Rep 14, 21408 (2024). https://doi.org/10.1038/s41598-024-71779-4
DOI: https://doi.org/10.1038/s41598-024-71779-4