Unveiling individual and collective temporal patterns in the tanker shipping network

Teo, Kevin; Arnold, Naomi; Hone, Andrew; Coulon, Michael; Ireland, Martin; Santillana, Mauricio; Kiss, István Z.

doi:10.1038/s41467-026-70013-1

Download PDF

Article
Open access
Published: 27 February 2026

Unveiling individual and collective temporal patterns in the tanker shipping network

Nature Communications volume 17, Article number: 3300 (2026) Cite this article

5263 Accesses
2 Citations
14 Altmetric
Metrics details

Subjects

Abstract

The global oil tanker shipping network emerges from individual ship and fleet decisions driven by economic, environmental, and operational factors. However, most existing shipping network analysis rely on static, time-aggregated representations, overlooking critical temporality connecting individual vessel routing strategies with both operational efficiency and global cargo flows. To address this gap, we introduce a dual-scale framework complementing sequential motif analysis—capturing recurrent patterns in vessel movement sequences—with Dynamic Mode Decomposition (DMD), extracting temporal dynamics from vessel trajectories to global cargo flows. Using tanker movement data across four vessel classes, we demonstrate that vessels exhibiting diverse regional exploration patterns spend up to 50% more time transporting rather than seeking cargo, indicating greater economic and environmental efficiency. At the system scale, DMD analysis reveals distinct seasonality with an average peak-to-trough amplitude of 16%. Major import regions show synchronous annual demand cycles, while export regions exhibit anti-synchronicity. These temporal patterns, invisible to static analysis, reveal performance differences that enable route optimization for both economic and environmental benefits.

Dynamic fleet management of waterborne vessels with mixed passenger and parcel services

Article Open access 14 April 2025

Dynamic monitoring of dust transport effect on maritime visibility using multi source satellite data and advanced deep learning approach

Article Open access 21 October 2025

Enhancing global maritime traffic network forecasting with gravity-inspired deep learning models

Article Open access 19 July 2024

Introduction

The global maritime network handles over 80% of international trade by volume, with crude oil and petroleum products alone accounting for nearly 30% of this market^1,2. Despite its economic significance and substantial environmental impact^3,4,5,6, maritime shipping networks remain understudied compared to other transportation systems, particularly regarding their temporal operational dynamics⁷. This research gap is especially pronounced for oil tanker networks, which exhibit fundamentally different operational characteristics from the more extensively researched container shipping networks, where fixed schedules and regular route structures have made traditional complex network analysis particularly effective for understanding system-level properties⁸.

Current maritime network research has predominantly focused on static, time-aggregated analyses, with four main analytical themes: trade and connectivity; hubs and centrality; vulnerability and robustness; and communities and spatial structure^9,10,11. These studies established that global shipping is dominated by hub-and-spoke structures, where major ports are connected to numerous smaller ports, while other important ports act as regional gateways controlling the flow of goods between different geographic areas^{8,12,13,14,15,16}. Moreover, such analyses have also been used to compare changes in network connectivity and regional communities at different time periods, often before and after significant geopolitical events^17,18,19. These studies are commonly interpreted through the lens of economic and ecological impacts^15,20,21,22.

Despite these insights, static approaches that aggregate temporal data obscure critical operational patterns. When networks are analyzed as snapshots, essential sequential information is lost: the timing and order of vessel movements between regions, and the dynamic variation in cargo flows over time. This information loss masks valuable operational insights, as vessels following identical aggregate movement patterns between the same ports may employ fundamentally different sequential strategies that significantly impact their efficiency and market positioning. These different temporal routing patterns create distinct operational outcomes that remain invisible in static networks.

These temporal dynamics are especially critical in oil tanker movements, which are operationally different compared to container liners. Unlike container ships operating on fixed schedules along established routes, tankers operate in charter-based markets where individual vessels compete for cargo, creating highly flexible and responsive movement patterns⁸. These competitive dynamics generate precisely the type of sequential, temporal dependencies that static aggregation eliminates—as vessels dynamically adjust their routing sequences based on market opportunities and cargo availability. Recent work on oil tanker networks has examined traffic density²³, hub-and-spoke structures²⁴, and influential port identification¹⁴, yet these studies maintain the static network paradigm and do not capture the temporal operational patterns that may distinguish successful from unsuccessful competitive strategies.

To overcome these analytical limitations and uncover the hidden temporal structure, we demonstrate in this study that sequential motif analysis (which identifies recurring patterns of regional visits) and Dynamic Mode Decomposition (DMD, a method for extracting temporal patterns from time series data) reveal previously undetected patterns spanning both individual vessel operations and global flow dynamics. This dual-scale approach links routing strategies to efficiency outcomes and cargo flows to seasonal cycles. Specifically, we show that ships exhibiting diverse regional exploration patterns—captured through sequential motifs—achieve significantly better laden-ballast ratios (LBRs), a key efficiency measure closely related to the industry-standard Energy Efficiency Operational Indicator. In parallel, DMD applied to regional cargo flow time series reveals that different ship classes operate with distinct regional footprints and specialized seasonal trading patterns. We identify previously uncharacterized synchronous and anti-synchronous relationships between maritime regions that correlate with economic and climate cycles. While these analyses operate at different scales—individual vessel behavior versus collective fleet dynamics—they provide complementary insights into the multi-scale temporal structure of maritime operations.

Results

Vessel data description

We obtained proprietary data from our partner company AlphaOcean, covering laden (loaded with cargo) voyages taken by 3026 medium to large crude-oil and petroleum tankers from 2016 January to 2020 February; see illustration in Fig. 1. This consists of 452 Panamax, 1141 Aframax, 610 Suezmax, and 823 Very Large Crude Carrier (VLCC), listed in order of their size in deadweight tonnage. While the locations provided are accurate up to the port-level, our analysis condenses this to a regional level, grouping 1337 ports into 26 regions. This aligns with industry practices of pricing freight rate indices on a region-to-region basis and reflects the uncertainty that exact discharge ports are often not specified to operators until late in the voyage.

**Fig. 1: Schematic representation of voyages in time and space of two ships.**

Ship routing performance indicators

The tanker ships in our study typically do not travel on fixed routes, nor do they service particular countries or regions exclusively. Instead, they are free to compete for cargo across different markets and regions, with their movements driven by economic considerations such as travel times, fuel and staffing costs, potential earnings, and other market opportunities. This operational flexibility presents a unique challenge for tankers: how do we evaluate the routing performance of ships?

A key challenge in evaluating tanker routing performance lies in the inaccessibility of comprehensive voyage and vessel data. We propose an accessible metric using temporal data—the LBR,

$${{{\rm{LBR}}}}=\frac{{\sum }_{l\in L}\Delta {t}_{l}}{{\sum }_{l\in L}\Delta {t}_{l}+{\sum }_{b\in B}\Delta {t}_{b}},$$

(1)

where Δt_l is the at-sea duration of laden leg l ∈ L (adjusted for port waiting times; see Methods for details and robustness checks) and Δt_b is the duration of ballast leg b ∈ B, where L and B are the sets of laden and ballast legs, respectively, of a ship’s trajectory. The ratio above simply measures the proportion of time spent laden at sea out of the total time at sea. The LBR is designed to capture the relative amount of time a ship spends performing useful work (transporting cargo) versus non-revenue generating operations (repositioning to load cargo). By focusing solely on temporal patterns of laden versus ballast operations, the LBR excludes ship-specific technical factors such as fuel consumption patterns or engine specifications. This design allows us to avoid scenarios where a vessel with efficient engines but poor routing outperforms another vessel with less efficient engines but a superior routing strategy. This enables fleet-level comparisons where comprehensive operational data remains unavailable.

The stratification of our analysis by vessel class (Panamax, Aframax, Suezmax, VLCC) controls for substantial inter-class variability in vessel characteristics and operational profiles. Within each class, multivariate analysis controls for vessel age, effectively isolating routing efficiency from vessel-specific factors. Furthermore, as demonstrated in Supplementary Eqs. (1)–(5), LBR captures fundamentally similar efficiency patterns to the International Maritime Organization’s Energy Efficiency Operational Indicator (EEOI)²⁵, which measures CO₂ emissions per unit of cargo work. While EEOI remains an industry gold standard, its calculation requires proprietary fuel consumption and precise route distance data typically unavailable in fleet-level research datasets. Our LBR measure, though simplified, provides a theoretically grounded and empirically useful measure that reflects revenue-optimizing incentives, enabling large-scale analysis of routing efficiency patterns across global shipping networks.

The LBR distributions in Fig. 2a expose significant performance disparities across each ship class, with the upper quartile (top 25%) displaying substantially higher values compared to the lower quartile (bottom 25%). Comparing median values between these groups reveals that the upper quartile achieves LBRs approximately 50% higher than the lower quartile. This clear difference in time spent carrying versus seeking cargo indicates significant heterogeneity in operational strategies within identical ship classes. Such performance differentials suggest that optimizing routing strategies could substantially improve both economic outcomes and environmental performance.

**Fig. 2: Distribution of vessel measure data across classes.**

This performance landscape reveals distinct efficiency profiles that align with each vessel class’s operational niche. Panamax, Aframax, and Suezmax vessels show comparable LBRs with overlapping median ranges, while VLCCs achieve marginally higher values due to their specialized operational advantages. As the largest vessel class, VLCCs are often deployed on inter-regional long-haul routes inaccessible to smaller ships, maximizing their at-sea laden time while minimizing time spent waiting in port. These longer voyage durations enable easier fixture planning, reducing inter-voyage delays that would otherwise erode efficiency.

Tanker vessels exhibit remarkable freedom to explore diverse markets rather than operating from fixed regional bases. Analysis of total regional coverage, Fig. 2b, uncovers distinct operational strategies across ship classes that directly reflect their design constraints and market positioning. Panamax and Suezmax vessels demonstrate similar exploration capabilities, visiting a median of 12 regions, indicating comparable operational flexibility despite size differences. Aframax vessels show the greatest variation in regional diversity with broader distribution patterns (median: 9 regions), reflecting their widespread usage, superior versatility for different cargo types, and port accessibility. VLCCs operate under significant geographical constraints, visiting only 7 regions on average with markedly narrow distribution. This restricted mobility stems from their specialized infrastructure requirements—as the largest vessels in the global fleet, VLCCs can only access ports with sufficient depth and handling capacity for their extreme size.

Extracting sequential motifs

Ship routing decisions involve complex factors that create distinctive movement signatures, exposing the strategic thinking behind high-performance operations. Consider two ships visiting identical regions: Ship 1 follows (Middle East, China, Southeast Asia, Middle East, China, Southeast Asia)—systematically cycling through three regions—while Ship 2 alternates (Middle East, China, Middle East, Southeast Asia, China, Southeast Asia)—return journeys between pairs of regions. In this example, time-aggregated networks would show identical connectivity between these three major trading regions, yet their sequential trajectories reveal fundamentally different strategies: systematic regional diversification versus back-and-forth movements between trading partners. To better understand the routing patterns that arise from these decision processes, we analyze short repeating sequences known as sequential motifs.

Previous studies have shown that ship trajectories can be modeled statistically using higher-order Markov chains, where future locations depend not only on the current position but also on recent prior locations^26,27. Higher-order networks generalize this correlation structure of individual agent trajectories to a network representation where nodes represent trajectory subsequences rather than individual ports, embedding these sequential dependencies in the network, where the order indicates the correlation length captured²⁸. Utilizing this framework on our tanker dataset, we systematically identified order-2 as the optimal order based on information-theoretic model selection criteria (see “Methods”). Intuitively, this means that routing decisions depend on both the previous location and current location, but extending knowledge further back adds little predictive value while dramatically increasing model complexity. Building on the insights of this framework, we examine ship movements through 2-hop motifs to uncover the relationship between routing patterns and operational routing performance.

Five distinct strategic patterns emerge from routing decisions. These are the five possible 2-hop motifs, which we call AAA, ABA, AAB, ABB, and ABC, illustrated in Fig. 3a. The specific regional labels are not relevant; instead, motifs are identified by the topological relations between elements in the motif. For example, a journey sequence (China, Japan, China), (Japan, China, Japan), and (UK, Mediterranean, UK) all share the same underlying pattern: a ship visits region A, then region B, then returns to region A. This maps onto the motif ABA, regardless of the regions involved. Thus, ABC motifs represent maximally diverse routing (3 unique regions within 2 consecutive legs), while AAA motifs indicate routing focused within a single region.

**Fig. 3: Diagrammatic examples of 2-hop motifs and their observed counts over time.**

We examine the absolute counts of different motifs over time, shown in the stacked time series in Fig. 3b. Despite random fluctuations, autocorrelation analysis reveals no clear temporal patterns in any of the motif counts, suggesting that fleet-level routing behaviors remain stable and consistent over time (see Supplementary Fig. 4 for autocorrelation plots). The motifs ranked based on occurrence frequency consistently reveal ABC as the most common motif, followed by ABA, and then AAA. This allows us to aggregate our motif counts across the whole duration of the data without losing critical information.

To examine the relationship between routing patterns and LBR, we compared the motif prevalence between high-performing (top quartile by LBR) and low-performing (bottom quartile by LBR) ships, stratified by vessel class and cargo status (shown in Fig. 4). Within each performance group, prevalence values are calculated as the weighted average of individual ship prevalence rates, where each ship is weighted by its total number of observed motifs; e.g., the weighted average prevalence of laden-first ABC motif among all Panamax ships is 30%.

**Fig. 4: Observed motif prevalence of each ship class based on laden-ballast ratio (LBR) performance rank.**

Our analysis reveals a striking relationship: high-performing ships consistently exhibit a greater prevalence of ABC motifs (visiting three different regions consecutively) and a diminished prevalence of AAA motifs (remaining within the same region). Meanwhile, the inverse is true for low-performing ships. This relationship holds across all ship classes, with top 25% performers displaying notably larger ABC prevalence and correspondingly smaller AAA prevalence compared to the bottom 25% performers (seen by the marker sizes in top 25% versus bot 25% columns).

The cargo status analysis reveals prominent usage of intra-regional legs for ballast operations. In laden → ballast motifs, ABB motifs (where the second leg represents intra-regional ballast movement) are more prevalent than AAB motifs overall, and especially in the top 25% group, consistent with the strategy of minimizing ballast travel times by seeking nearby cargo when available. The reverse applies to the ballast → laden motifs, with increased AAB prevalence for high-performance ships optimizing their cargo pickup location.

We strengthen our findings through comprehensive statistical validation. We systematically control for ship age—distribution shown in Fig. 2c—as a potential confounder and establish the statistical significance of observed trends through multiple approaches. Our stratified analysis by vessel class and motif cargo status employs Beta regression with motif prevalence as the dependent variable and LBR and ship age as regressors, isolating the independent contribution of each predictor (detailed in “Methods” and Supplementary Methods). The regression coefficients quantify each regressor’s unique effect on motif prevalence, with p-values confirming whether these effects are statistically significant. Additionally, we capture the magnitude of group separation using Cliff’s delta δ^29,30, a robust non-parametric effect size that reveals distributional shifts beyond what simple mean comparisons can detect.

The associations between movement patterns and vessel LBR discussed above are statistically significant, show substantial effect sizes, and remain robust after controlling for vessel age. Effect sizes measured by Cliff’s delta, shown in Fig. 5, reveal large positive effects for ABC motifs across all vessel classes, while AAA motifs display small to large negative effect sizes for Panamax, Aframax, and Suezmax vessels. The laden-first ABB and ballast-first AAB motifs also exhibit large effect sizes across all vessel classes. These patterns achieve statistical significance (p < 0.05 after controlling for false discovery rates³¹) with the sole exception of AAA motifs in VLCCs. Critically, adjusting for ship age produces negligible changes in effect sizes (mean absolute difference ∣δ_orig − δ_adj∣ = 0.056; 20 out of 32 comparisons differ by less than 0.05), confirming that the motif-LBR associations represent genuine operational patterns rather than age-related artifacts. Qualitative interpretations of effect magnitude remain consistent whether or not age is explicitly controlled, with only Aframax’s AAA ballast → laden shifting from large to medium effect size after adjustment (Fig. 5).

**Fig. 5: Motif effect size (Cliff’s delta δ) comparing top- and bottom-performers by laden-ballast ratio.**

VLCCs present a slightly unique case due to their operational constraint: unlike other ship classes, VLCCs show minimal AAA usage across both performance groups, reflecting their specialized role in long-haul routes with few intra-regional routes available to them. Instead, high-performing VLCCs show a reduced ABA motif prevalence (representing round-trips between two regions), suggesting that even within their operational constraints, high-performing VLCCs benefit from route diversification rather than repeating the same trading route. While reduced ABA motif prevalence of the top 25% is also observed in other ship classes, their effect sizes are notably smaller than in VLCCs, suggesting a weaker reliance on ABA motifs across the general population of ships (see Supplementary Fig. 5 for VLCC leg duration comparisons of ABC and ABA motifs).

Together, these findings indicate that route diversification represents a key component of operational efficiency, with more efficient ships consistently exhibiting varied market exposure while optimizing ballast destinations through intra-regional movements.

Identifying periodicity in shipping activity

While our motif analysis establishes a connection between individual ship efficiency and route diversification, these individual routing decisions aggregate to create system-level phenomena. The collective movements of many ships, driven by opportunities arising from fluctuations in the crude oil market, manifest as non-trivial temporal signals at the regional level. Although some fluctuations are a result of highly unpredictable events (i.e., geopolitical disruptions or accidents), others may be driven by predictable physical or social phenomena, ranging from climate patterns³² to socioeconomic patterns in production, operation, or consumption^33,34. We therefore shift our focus from ship-centric trajectories to examine the macro-scale cargo flow dynamics that these aggregated movements generate across regions.

These macro-scale dynamics are captured by time-series y(t) of cargo flows (measured in 10,000 deadweight tonnage) entering and leaving a region for each ship class, yielding 8 time series per region. Each unique combination [region, direction, ship class] is referred to as a regional flow segment, indexed with subscript n. To capture the complex dynamics within this collective time series y(t) = [y₁(t), …, y_n(t), …], we employ bagging-optimized DMD^35,36,37, a powerful and noise-robust tool for extracting spatio-temporal patterns from a vector time series, with regional flow segments serving as spatial coordinates. Detailed methodology on the time series construction and DMD reconstructions is in the “Methods” section.

A rank R DMD yields R complex spatial components ϕ_n,j, scalar amplitudes A_j, and a temporal frequency component ω_j, where n is the regional flow segments index, and j is the index of the R modes. The DMD reconstructed time series for region flow segment n can be expressed in the form

$${\hat{y}}_{n}(t)={\sum}_{j}{A}_{j}| {\phi }_{n,j}| \cos \left({\omega }_{j}t+\arg ({\phi }_{n,j})\right).$$

(2)

Despite the noisy data, the DMD method is able to consistently recover periodic oscillatory patterns; examples for the reconstruction ${\hat{y}}_{n}(t)$ for South East Asia are shown in Fig. 6. The reconstruction errors are measured using the normalized root mean squared error RMSE*³⁸, which lies in the range of [0, 1] where 0 represents a perfect signal reconstruction. We see that for RMSE*≤0.5, the reconstructions visually fit the peaks and troughs, although much of the shorter-scale fluctuations are not captured.

**Fig. 6: Time evolution of cargo flow in Southeast Asia by vessel type and flow direction.**

The amplitude A_j is interpreted as the overall contribution of mode j to the dynamics; modes with larger A_j are more important. The spatial components ϕ_n,j represent the individual contributions of the regional flow segments n to mode j, and the complex argument of ϕ_n,j introduces a phase shift that controls when the peaks in each cycle occur. Thus, the combined term A_j∣ϕ_n,j∣ in Eq. (2) is interpreted as the amplitude of the regional flow segment n. The temporal frequencies are more easily understood as cosine oscillation periods τ_j = 2π/ω_j. Note that, since DMD modes come in complex conjugate pairs with identical contributions to the reconstruction, we interpret our results in terms of these pairs.

Our decomposition uncovers a dominant annual cycle that governs global tanker movements. The scalar amplitudes and oscillation periods from the decomposition are shown in Fig. 7, where the mode index j is ordered in decreasing period lengths. The dominant mode-pair exhibits a striking 51.5-week oscillation period with 16% peak-to-trough amplitude variations across regional flows. To strengthen the validity of this finding, we applied two alternative periodicity extraction methods: the Fourier transform and Multi-channel Singular Spectrum Analysis (MSSA)³⁹. Both methods independently recovered dominant periods of 48.0 weeks (95% CI: 47.1−49.9 weeks) via Fourier analysis and 49.5 weeks via MSSA, confirming the validity of our findings (see Supplementary Figs. 9, 10, and 11).

Fig. 7: Amplitude Aj against period τj from dynamic mode decomposition. — **Fig. 7: Amplitude A_j against period τ_j from dynamic mode decomposition.**

This yearly seasonal cycle firmly matches our prior expectations, as the natural seasons shape physical and socio-economic forces that drive the maritime oil trade. There are several direct factors that affect the movement of ships. In particular, the economic supply and demand of crude oil is driven primarily by the transportation sector, with secondary influences from the petrochemical industry and the energy sector^40,41. Analysis of global crude oil consumption³³ also confirms annual seasonality in consumption patterns. Our primary focus lies on interpreting this mode pair; we present the rest of our results in Supplementary Figs. 12–16.

Our results reveal that different ship classes operate with distinct regional footprints, creating specialized seasonal trading patterns across the globe. Analysis of regional contributions to annual seasonality in each ship class, visualized in Fig. 8, identifies where each vessel class concentrates its seasonal operations, with incoming and outgoing flow directions revealing the critical import and export hubs driving global trade cycles. The most significant patterns are described below.

**Fig. 8: Normalized regional amplitudes by ship class and flow direction.**

For incoming flows, VLCCs exhibit the most regional concentration in seasonal amplitude variation, seen in Fig. 8a, with China dominating the annual oscillations. This concentration reflects both the dedicated infrastructure required for the largest vessels and their substantial cargo capacity. These factors draw VLCCs to regions with the highest demand—primarily economically developed areas in the northern hemisphere with port facilities capable of accommodating these massive ships.

The geographical distribution of high incoming flow amplitude regions becomes increasingly dispersed for smaller vessels, reflecting their greater operational flexibility and broader port accessibility. Suezmax vessels (Fig. 8b) maintain strong seasonal amplitudes around China but show their most prominent activity in Southeast Asia, with notable contributions from South America. Aframax vessels (Fig. 8c) demonstrate further geographical spread encompassing much of Asia and Central America, while Panamax vessels (Fig. 8d) exhibit the most even distribution of amplitudes across the globe. The prominence of Western European and US ports for Panamax operations may reflect port size limitations in these regions. The amplitude scale factors (VLCC: 292, Suezmax: 103, Aframax: 100, Panamax: 38) reveal that a significant portion of seasonal demand is supplied by VLCCs, particularly toward China, while Aframax and Panamax vessels support most of the remaining global seasonal demand.

The outgoing flow patterns reveal complementary geographical specializations shaped by both resource distribution and maritime infrastructure constraints. VLCC exports (Fig. 8e) show dominant seasonal amplitudes primarily in the Middle East, followed by South America and Southeast Asia. While the Middle East and South America represent major crude oil production centers, Southeast Asia’s prominence reflects its role as a major crude oil trading and storage hub rather than production, serving as a redistribution centers for the Far East market. Suezmax (Fig. 8f) exports are primarily driven by West Africa, the Mediterranean, and the Middle East—a pattern reflecting how Suezmax vessels, designed as the largest ships capable of Suez Canal transit, can efficiently serve seasonal Asian demand.

Aframax (Fig. 8g) exports remain concentrated in the Middle East, while Panamax (Fig. 8h) exports show peak amplitudes in the western United States, followed by Western Europe, West Africa, and China. The western United States dominance in Panamax exports suggests these vessels transited the Panama Canal eastward to access Atlantic regions⁴². These amplitude distributions demonstrate how size constraints interact with port infrastructure limitations and geographical resource distributions to create distinct seasonal flow patterns across the global shipping network.

Additionally, the DMD analysis on phase shifts uncovers patterns of synchronous and anti-synchronous oscillations between ship classes and between regions, illustrated in Fig. 9. We find that few regions exhibit strong synchronicity across the ship classes, as observed through the clustering (or lack thereof) of markers along rows in Fig. 9. Notably, only China has the phase shift of all ship classes being aligned around a seasonal peak in February/March, or π/2 corresponding to an offset of 3 months, (see Fig. 9a). Other notable regions (Middle East and South East Asia) show moderate synchronicity, where only the VLCC outgoing flows display a distinctly different phase shift, highlighted in Fig. 8b, c.

**Fig. 9: Phase-amplitude of cargo flow for each region.**

We also observe clustering of multiple regions around the start of the calendar year, i.e., February/March (see Fig. 9d), corresponding to the winter months of the Northern hemisphere. Indeed, these include prominent developed economic northern regions in Europe and Asia, which explains their ability to import in large quantities. Interestingly, South East Asia, while not a northern region, is largely in sync with this particular seasonal cycle, potentially due to their strong trading relationship with their northern Asian partners. Meanwhile, contributors to the outgoing flow (such as the Middle East) experience a peak around June/July or −π/2, anti-synchronous to the winter peaks. American seasonal patterns (USG, USW, and USA) do not appear to be in sync with the rest of the northern regions, potentially explained by their increasing independence in crude oil production⁴⁰.

Discussion

We demonstrated that moving beyond time-aggregated static network analysis reveals previously undetected patterns in maritime tanker networks at the individual vessel and fleet level. By applying sequential motif analysis, we uncover links between individual routing strategies and operational efficiency, while DMD shows seasonal spatio-temporal patterns in the dynamic global flow of cargo.

At the individual vessel level, our sequential motif analysis reveals that strategic route diversification correlates with superior operational routing performance across all ship classes. Top-performing ships favor ABC motifs (regional exploration) over AAA motifs (single-region focus) and ABA motifs (repeating round trips), even after controlling for ship age and stratifying by ship class, suggesting genuine strategic advantages beyond economies of scale or vessel-specific technical characteristics. While a definitive causal relationship warrants further investigation, we interpret this correlation as reflecting responsiveness to regional price differentials and cargo availability. Unlike container ships with fixed routes, tankers operate in charter-based markets where strategic flexibility enables exploitation of market opportunities across different regions, often manifesting as ABC motifs. Our findings suggest that fleet operators practicing regional specialization may improve performance through greater strategic diversification and market responsiveness. We note that AAA motifs encompass heterogeneous routing behaviors (i.e., from repetitive shuttles between specific port pairs to diverse movements across multiple ports within a region), which our regional aggregation cannot distinguish; future sub-regional analysis could clarify whether diversification benefits persist at finer spatial scales.

At the system level, our DMD analysis identifies pronounced seasonal patterns in regional cargo flows with quantifiable periods and phase relationships between maritime regions. These seasonalities represent strong underlying structural trends; however, they can also be obscured by short-term fluctuations week-to-week due to non-linear drivers—including market volatility, port congestion, weather delays, or disruptive events—which manifest as high-frequency noise. Our findings should therefore inform strategic planning over longer horizons. For fleet operators, this enables strategic positioning towards regions experiencing increasing seasonal supply or demand, opening opportunities for vessels willing to explore markets beyond their traditional operational areas. For port and governmental authorities, these patterns enable proactive planning for quarterly import/export variations, including timing infrastructure upgrades during off-peak periods, allocating seasonal workforce and berth capacity, and coordinating logistical resources to match anticipated cargo.

These findings establish strong ties between routing performance incentives and routing diversification; the preference for exploration over regional concentration among high-performing vessels provides actionable insights for fleet optimization strategies that can outperform competitors in revenue generation versus costs incurred. Furthermore, our new framework closes methodological gaps explicitly identified by Álvarez et al.⁹, who emphasized the need for “deeper comparison among connected ports in terms of performance indicator” and urged researchers to “check the interactions between economy evolution in different regions and the maritime routes”.

However, several limitations should be acknowledged. In the absence of ship-specific operational data, we constructed a performance indicator using the durations of laden and ballast legs. While these durations are recorded precisely in the data, they include port waiting times stemming from factors outside operator control (e.g., port congestion, administrative delays), which ideally would be isolated to strengthen inferences about routing strategy efficiency. With the availability of additional data—by including route-specific details such as the distance traveled and freight rate realized, ship-specific details such as the average speed and fuel consumption rate, or environmental conditions such as wind speeds or wave heights⁴³—researchers could construct alternative performance indicators or EEOI proxies⁴⁴, including CO2-based indicators, enabling more direct evaluation of individual vessel routing decisions’ environmental impact alongside economic efficiency. Alternatively, such enriched datasets would enable fleet-level comparative analysis through multivariate approaches that simultaneously control for technical, environmental, and operational factors while isolating routing strategy effects.

While the Bagging-Optimized DMD method incorporates sub-sampling to improve robustness against noise, our approach assumes an additive linear seasonal model⁴⁵, and deliberately excludes nonlinear drivers such as oil price shocks, policy changes, and geopolitical tensions or conflicts. Nonetheless, our primary objective was to establish whether underlying seasonal components exist and can be recovered despite system noise—and indeed we successfully demonstrate both. Future research could develop more predictive models that explicitly incorporate disruptive events alongside the baseline seasonal patterns we have established.

Our methodological choices, together with the pre-COVID time frame of our dataset (2016 January to 2020 February), limit the direct generalizability of the specific spatial and temporal patterns we report. Nevertheless, our dual-scale framework is fully deployable on post-COVID data, where comparisons to pre-COVID data can reveal whether major disruptions have fundamentally reshaped seasonal trading behavior or whether the system has reverted to its pre-disruption state.

Our methodology extends beyond oil tanker applications. Sequential motif analysis applies directly to dry bulk carriers and other charter-based shipping or transport sectors with flexible routing patterns. The DMD component could be applied to any transportation system with constrained capacity and regional flows, including trucking networks or airline cargo operations, while the general framework of linking individual behavioral patterns to system-wide efficiency outcomes has broad applicability across complex networked systems.

Methods

Data structure

The dataset contains a list of laden legs (single trips where the ship is loaded with cargo), detailing the origin/destination locations and their respective departure/arrival times, as well as an identifier of the ship that completed the leg. This enables the reconstruction of the entire historical journey of an individual ship, illustrated in Fig. 1, typically consisting of alternating laden and ballast legs. In total, we recovered over 192,000 shipping legs (both laden and ballast), split amongst 33,509 Panamax, 94,840 Aframax, 34,401 Suezmax, and 29,418 VLCC legs.

Ballast leg definition

Ballast legs (where a ship is not carrying cargo) are not explicitly recorded in our dataset; instead, we infer their occurrence between two consecutive laden legs of the same ship: given that a particular ship performs two laden legs from regions a → b and then c → d, this implies a ballast leg from b → c. Consequently, our definition of ballast legs may include other activities not typically related to reaching the next load port. Note that, as the only timestamps recorded are the departure and arrival times of each laden leg, the duration of ballast legs is consequently on average, artificially longer than that of laden legs, as the additional amount of time spent on loading and discharging is not known. Nonetheless, including these port waiting times is logical from an economic perspective, as delays of this type incur further costs to operators without adding to revenues, much like ballast legs do.

Port waiting days adjustment

Our dataset records laden voyage departure and arrival times, but arrival timestamps correspond to berthing rather than port arrival, systematically conflating at-sea time with port waiting time. Since port waiting times vary randomly, this introduces a randomly distributed positive bias that disproportionately affects shorter voyages.

We corrected this systematic bias using a reference dataset with complete temporal information (including separate port arrival and berthing times) covering different vessels and time periods. Analysis of this reference data revealed that port waiting times are approximately independent of voyage duration and appear randomly distributed within each vessel class. Empirically, these distributions are well-characterized by lognormal models, which we fitted using maximum likelihood estimation to obtain vessel-class-specific parameters (μ, σ) in log-space.

To ensure conservative corrections that avoid introducing bias in the opposite direction, we applied lower-bound estimates based on the mode of each lognormal distribution $\exp (\mu -{\sigma }^{2})$. This approach identifies, with 95% confidence, the minimum port waiting time that vessels spend at the end of each laden leg. The conservative adjusted at-sea laden durations are obtained by subtracting the correction values from observed laden leg durations (see Supplementary Fig. 2).

Validation analysis demonstrates that LBR performance rankings remain stable across correction levels: mean rank changes are ≤2 positions for VLCCs, Suezmax, and Panamax, and ≤5 positions for Aframax vessels under median corrections (see Supplementary Fig. 3), with smaller changes for modal correction. For multi-event voyages (multiple loads or multiple discharges), we applied the same correction methodology at intermediate port visits to the total port duration from arrival to departure.

Comparison of laden-ballast ratio to EEOI

Performance indicators are useful tools for evaluating the efficiency of a ship’s operations. The ideal performance indicator would accurately pick out the most efficient ships (e.g., high revenue yield, low emissions rate, low incurred costs, etc.) and the least efficient ships accurately. The International Maritime Organization (IMO) encourages the use of the Energy Efficiency Operational Indicator (EEOI), which acts as an industry standard indicator. The EEOI is defined as

$${{{\rm{EEOI}}}}=\frac{{\sum }_{i}{C}_{i}}{{\sum }_{i}{T}_{i}{D}_{i}},$$

(3)

where C is the carbon emissions, T is the cargo tonnage, D is the distance traveled, and i is the index of a leg (which can be either laden or ballast). The carbon emissions C are calculated as ∑_α F_αc_α, where F_α is the quantity of fuel consumed, and c_α is the CO₂ conversion factor for fuel type α. However, such detailed data are often only available to ship or fleet owners, making direct calculations and comparisons of the EEOI challenging on an industry level. Additionally, the EEOI is particularly sensitive to ship-specific parameters and, as such, is not as useful in comparing ship-independent processes.

To tackle this challenge, we proposed an alternative route performance indicator, the LBR. In our data, the key parameters that are available to us are (i) the load departure time. and (ii) discharge arrival times. Given trajectory J = {(t_i, s_i)} where t_i are port visit times and s_i ∈ {load, discharge}, we define the set of laden legs L and ballast legs B as

$$L=\{({t}_{i},{t}_{i+1}):({s}_{i},{s}_{i+1})=\,{{{\rm{(load,\; discharge)}}}}\},$$

(4)

$$B=\{({t}_{i},{t}_{i+1}):({s}_{i},{s}_{i+1})=({{\rm{discharge}}},\;{{{\rm{load}}}})\}.$$

(5)

Using the definitions from Eqs. (4) and (5), we compute the durations of the laden legs and ballast legs as

$$\,{{{\rm{For}}}}\,l=({t}_{1},{t}_{2})\in L{{{\rm{}}}}:\Delta {t}_{l}={t}_{2}-{t}_{1}-{{{{\rm{PWT}}}}}_{c},$$

(6)

$$\,{{{\rm{For}}}}\,b=({t}_{1},{t}_{2})\in B{{{\rm{}}}}:\Delta {t}_{b}={t}_{2}-{t}_{1},$$

(7)

where PWT_c is the lower-bound estimated port waiting time for vessel class c. The LBR for a given ship is then defined as

$${{{\rm{LBR}}}}=\frac{{\sum }_{l\in L}\Delta {t}_{l}}{{\sum }_{l\in L}\Delta {t}_{l}+{\sum }_{b\in B}\Delta {t}_{b}},$$

(8)

where Δt_l is the at-sea duration of laden leg l and Δt_b is the duration of ballast leg b.

The LBR measures the proportion of time a ship spends carrying cargo during its journey. This measure captures both the revenue-generating aspect of laden durations and the cost-savings associated with shorter ballast durations. By focusing on temporal patterns—time spent laden versus ballast—it isolates the effectiveness of routing decisions independent of vessel-specific factors such as engine efficiency or fuel type. Ships with higher LBR spend more time performing useful work transporting cargo, or waste less time and energy traveling ballast, or both. As the industry increasingly adopts data-driven operations, this measure provides a practical tool for evaluating and comparing route-based performance between ships or fleets across different ownership groups.

Furthermore, the LBR shares similar properties with the EEOI. First, longer laden legs are beneficial. In the EEOI calculations, the average ratio of fuel consumed per unit distance for laden vs. ballast legs is much smaller than the ratio of tonnage carried (ballast legs have 0 useful tonnage carried). Thus, while laden legs emit more CO₂, longer laden legs tend be more efficient in the long run due to the tonnage carried being higher. Second, longer ballast legs are penalized. The ballast leg duration is defined as any time between consecutive laden leg durations. Thus, port waiting times are included in our ballast leg durations. As such, our performance indicator penalizes ballast legs more heavily than the EEOI—this approach reflects a financial perspective where time spent waiting at ports represents the same opportunity cost as time spent traveling ballast. See Supplementary Eqs. (1)–(5) for further details on EEOI compared to LBR.

Sequential motif mining

Sequential motifs represent short recurring patterns within trajectory sequences that capture higher-order dependencies in movement data⁴⁶. Unlike traditional network motifs that focus on structural relationships between ports^{8,13,47,48,49}, sequential motifs analyze the ordered trajectory patterns of individual ships navigating through the maritime network. Selecting an appropriate motif length (i.e., the chosen correlation length) is closely linked to identifying the optimal order in higher-order networks^26,27,28, where ℓ-hop sub-sequences serve as fundamental nodes of the network.

We determined optimal motif length by evaluating higher-order network models at orders 1–4 (beyond which model complexity becomes computationally expensive) using information-theoretic criteria that balance explanatory power against model complexity: Akaike Information Criterion (AIC)⁵⁰, Bayesian Information Criterion (BIC)⁵¹, and likelihood ratio tests⁵². Each of these criteria applies different penalties, offering complementary perspectives on the optimal order selection. Parameters were estimated via maximum likelihood fitting of transition matrices on the full dataset for each vessel class.

Across all vessel classes, both AIC and likelihood ratio tests consistently selected order-2 as optimal (see Supplementary Table 1). While BIC’s stricter complexity penalty favored order-1, the order-2 BIC values remained substantially closer to order-1 than to order-3, indicating that order-2 provides meaningful improvement without excessive parameterization. Intuitively, this means that routing decisions depend on the previous location and current location, but extending knowledge further back adds little explanatory power at the cost of dramatically increasing model complexity. For order ℓ = 2, the higher-order network edges represent 3-node trajectory subsequences, which our 2-hop motifs directly capture. This convergence of independent model selection criteria establishes that 2-hop motifs represent the minimal sequence length necessary to capture statistically significant sequential dependencies in tanker routing patterns.

Having established the optimal length, we now formally define sequential motifs. A sequential motif M of length ℓ is defined as a sequence of alphabets (m₁, m₂, . . . , m_ℓ+1) where ${m}_{i}\in {{{\mathcal{A}}}}$ for a finite alphabet set ${{{\mathcal{A}}}}=\{A,B,C,...\}$. Denoting regions as v_i and the set of regions as V, a regional sequence S = (v₁, v₂, . . . , v_ℓ+1) constitutes an instance of motif M if there exists a unique mapping function $f:V\to {{{\mathcal{A}}}}$ such that f(v_j) = m_j for 1≤j≤ℓ + 1, with the constraint that m_i = m_j ⇒ v_i = v_j. This formulation ensures that motifs capture topological relationships independent of specific regional identities.

At length 2, there are exactly five possible motif types: AAA, ABA, AAB, ABB, and ABC. For example, the sequence (China → Japan → China → SEA → SEA → Japan) contained the following 2-hop motifs: ABA, ABC, ABB, AAB.

The total number of possible motifs of length ℓ follows the Bell numbers, representing unique partitions of ℓ + 1 labels—see https://oeis.org/A000110.

Multivariate motif model

To control for potential confounding by vessel age, we employed multivariate Beta regression for each motif variant across all vessel classes. Beta regression is appropriate for modeling proportional data (e.g., motif prevalence ∈ [0, 1]) where observed values are close to the boundary^53,54. For each motif-class combination, we model motif prevalence ρ_M using Beta regression with logit link:

$$\,{{{\rm{logit}}}}\,(\mu )={\beta }_{0}+{\beta }_{LBR}\cdot {{{\rm{LBR}}}}+{\beta }_{age}\cdot {{{\rm{age}}}},$$

(9)

where $\mu={\mathbb{E}}({\rho }_{M}| {{{\rm{LBR}}}},{{{\rm{age}}}})$ is the conditional mean motif prevalence. The regression yields coefficients β_LBR and β_age quantifying the strength of association between each regressor and motif prevalence ρ_M. Statistical significance was assessed via hypothesis tests for β_LBR ≠ 0 after applying Benjamini-Yekutieli correction across 40 tests (10 motif variants × 4 vessel classes)³¹.

To isolate age effects from routing performance effects, we computed age-adjusted LBR residuals by regressing LBR against vessel age. These residuals capture the performance variation unexplained by age alone. Complete results on the Beta regression coefficients, p-values, and age-adjusted effect sizes are detailed in Supplementary Methods and Supplementary Table 2.

Motif mining algorithm

For each vessel journey S = (v₁, v₂, . . . , v_n+1) representing a sequence of regions visited, we first extract all consecutive 3-region subsequences S_i:i+2 = (v_i, v_i+1, v_i+2) for i = 1, . . . , n − 1, and then apply the topological mapping function to classify each subsequence. Letting κ_M(S) = count of motif M in S, for a collection of journeys S_i, the prevalence of motif M is calculated as

$${\rho }_{M}=\frac{{\sum }_{i}{\kappa }_{M}({S}_{i})}{{\sum }_{i}(| {S}_{i}| -2)}.$$

(10)

Dynamic mode decomposition for shipping dynamics

To uncover coherent patterns in our regional shipping data, we employ the DMD method³⁵, a powerful data-driven approach to studying dynamics of spatio-temporal data. DMD has been extended from its origins in fluid dynamics to a range of network data analysis—such as disease spreading dynamics and brain dynamics^55,56,57—yet remains unexplored in maritime network analysis despite its suitability for spatiotemporal pattern extraction in constrained systems.

DMD is able to treat spatial and temporal modes simultaneously, making it particularly effective for identifying system-wide trends, something that can be leveraged in networks where the underlying topological connections act as a latent space for the spatial modes. We specifically use bagging-optimized DMD^36,37—implemented in Python with the PyDMD package^58,59—which improves robustness against noisy data through ensemble averaging. We provide a summary of the DMD output next; see refs. ^36,37 for full details and theory.

Given a vectorized time series ${{{\bf{Y}}}}=\left[{{{\bf{y}}}}(1),{{{\bf{y}}}}(2),\cdots \,,{{{\bf{y}}}}(m)\right]$ over N regional flow segments and m time steps, the R rank DMD finds a complex-valued reconstruction

$$\hat{{{{\bf{y}}}}}(t)={\sum}_{j=1}^{R}{A}_{j}{{{{\mathbf{\Phi }}}}}_{j}\exp (i{\omega }_{j}t)$$

(11)

where A_j is a scalar amplitude, ${{{{\mathbf{\Phi }}}}}_{j}=({{\upphi }}_{1,j},{\upphi }_{2,j},\ldots,{\upphi }_{N,j})\in {{\mathbb{C}}}^{N}$ is a complex vector, and ${\omega }_{j}\in {\mathbb{R}}$ is the frequency of the j^th mode. The A_j, Φ_j, and ω_j terms are parameters to be fitted. In this case, the real part of $\hat{{{{\bf{y}}}}}$ is the best-fitted reconstruction of the data obtained by minimizing the Frobenius norm³⁷.

We can decompose the terms inside the right-hand side of Eq. (11) into more intuitive forms. In particular, the real component of the n^th region of the j^th mode can be written as

$$\Re [{A}_{j}{\upphi }_{n,j}\exp (i{\omega }_{j}t)]={A}_{j}| {\upphi }_{n,j}| \cos (2\pi t/{\tau }_{j}+{\theta }_{n,j}),$$

(12)

where θ_n,j is obtained from the polar form of the complex number ${\upphi }_{n,j}=| {\upphi }_{n,j}| \exp (i{\theta }_{n,j})$, and τ_j = 2π/ω_j is the period.

In order to apply DMD to our study, we must first get our shipping data in the form of y(t). Our data comes in the form of a set of laden legs. For a given vessel class, a laden leg describes a voyage from region u to region v between the times t_u and t_v with cargo weight w. This can be represented as a temporal edge (u, v, t_u, t_v). We then define a time step δt and rolling window interval ΔW, where t_i+1 = t_i + δt. A temporal edge (u, v, t_u, t_v) is then considered active in interval t + ΔW if

$$t\le {t}_{u}\le t+\Delta W\,\,{{{\rm{edge}}}}\; {{{\rm{begins}}}}\; {{{\rm{in}}}}\; {{{\rm{window}}}}$$

(13)

$$t\le {t}_{v}\le t+\Delta W\,\,{{{\rm{edge}}}}\; {{{\rm{ends}}}}\; {{{\rm{in}}}}\; {{{\rm{window}}}}$$

(14)

$${t}_{u}\le t\,\;{{{\rm{and}}}}\,\; {t}_{v}\ge t+\Delta W\,\,{{{\rm{edge}}}}\; {{{\rm{overlaps}}}}\; {{{\rm{window}}}}$$

(15)

Denoting the set of edges active in window t + ΔW as ${{{\mathcal{E}}}}(t,\Delta W)$, the time series of incoming or outgoing cargo flow for region i in time window t is the sum of the cargo weights w_e of all the active voyages $e\in {{{\mathcal{E}}}}(t,\Delta W)$:

$${y}_{(i,in)}(t)={\sum }_{(u,v,{t}_{u},{t}_{v})=e\in {{{\mathcal{E}}}}(t,\Delta W)}{\delta }_{i,u}{w}_{e}$$

(16)

$${y}_{(i,out)}(t)={\sum }_{(u,v,{t}_{u},{t}_{v})=e\in {{{\mathcal{E}}}}(t,\Delta W)}{\delta }_{i,v}{w}_{e}$$

(17)

where δ_i,u = 1 if i = u else 0 and w_e is in units of 10,000 deadweight tonnage (dwt). By definition then, ballast legs are not included. Thus, y_(i, in)(t) is the total cargo arriving at region i at time t, and y_(i, out)(t) is the total cargo departing from region i at time t.

An additional pre-processing step is to remove linear trends from y(t), as the DMD method can only recover sinusoidal and exponential terms. While the DMD algorithm may still converge without this step, it tends to fit eigenmodes with period ≫ the duration of the data set to approximate a linear term. This can obscure the functionality and interpretability of results, and thus it is better to remove it beforehand. We apply a linear regression in time to each time series, such that

$${\widetilde{y}}_{i}(t)={y}_{i}(t)-{r}_{i}t-{k}_{i}$$

(18)

where r_i and k_i are gradient and constant terms from a least-squares straight line fit. This brings our work in line with an additive seasonal model⁴⁵, where the original signal ${{{\bf{y}}}}({{t}})={{{\bf{L}}}}(t)+{{{\bf{I}}}}(t)+\widetilde{{{{\bf{y}}}}}(t)$, where L(t) is a long-term linear trend, and I(t) are irregular terms. The component $\widetilde{{{{\bf{y}}}}}(t)$ is composed of both the “seasonal” and “cyclical” terms, which in ref. ⁴⁵ corresponds to periodicities of <1 year and >1 year, respectively. However, in our case, we do not distinguish between the two.

Peak-to-trough relative amplitudes

To quantify the relative significance of an individual mode to each time-series, we need a standardized measure that accounts for the varying scales of cargo flows across different regional flow segments. To do this, we define the peak-to-peak relative amplitude, which normalizes each mode’s peak-to-peak amplitude by the total variability observed in the original time series:

$${{{{\rm{Relative\; Amplitude}}}}}_{n,j}=\frac{2{A}_{j}| {\upphi }_{n,j}| }{\max ({y}_{n})-\min ({y}_{n})}$$

(19)

where A_j∣ϕ_n,j∣ represents the amplitude of mode j in regional flow segment n, and $\max ({y}_{n})-\min ({y}_{n})$ is the peak-to-peak range of the original time series for regional flow segment n. The factor of 2 converts the cosine amplitude to peak-to-peak amplitude. An additional factor of 2 is introduced when considering the relative amplitude of a mode pair, since DMD modes come in complex conjugate pairs that both contribute to the reconstructed signal.

A larger value in the relative peak-to-peak amplitude indicates that the mode contributes more significantly to the observed fluctuations in that regional flow segment. The standardized nature of this measure also enables calculation of the average significance of each mode across all regional flow segments.

Choice of parameters

DMD requires the selection of several key parameters: SVD rank, time interval, rolling window duration, and number of delay embeddings. We employed a two-stage selection strategy utilizing error-based optimization (for SVD rank) and stability-based selection (for other temporal parameters) to avoid overfitting to data-specific noise.

For time interval (1 week), rolling window duration (6 weeks), and delay embeddings (22), we ensured that the parameter values chosen were statistically stable under partial variation. We systematically scanned parameter ranges while computing eigenvalue distributions for each combination, then calculated pairwise statistical distances (variational distance and Hellinger distance) between eigenvalue distributions across parameter sweeps⁶⁰. Regions of low statistical distance correspond to stable parameter regimes where extracted eigenvalues remain consistent across parameter perturbations (see Supplementary Fig. 8). For time intervals and rolling window duration, we selected small parameter values to balance eigenvalue stability with maximizing the number of time points available for decomposition. For delay embeddings, we selected the smallest value within the stable regime that ensures the data matrix is tall (more rows than columns), which is required for stable SVD decomposition.

We determined the SVD rank through a two-step process. First, we examined the explained variance, identifying an “elbow point” where the explained variance plateaus. Then we evaluated the root-mean-squared error (RMSE) around this elbow point, corresponding to candidate ranks 6−20. The final rank was chosen such that it provided the largest average marginal improvement in RMSE across a range of delay-embedding (see Supplementary Fig. 6). SVD rank 12 provided the largest average marginal improvement, while higher ranks showed inconsistent performance and diminishing returns. This approach balances reconstruction performance against model complexity to prevent overfitting to noisy signals.

The final parameter choices produced stable modes and eigenvalues that were well within the expected ranges under neighboring parameter choices. In particular, across all parameter combinations, we recovered the dominant annual mode with period 362 ± 10 days. Similar results were found for the other 5 modes; see Supplementary Fig. 7.

Reconstruction errors

The reconstruction errors for the τ = 51.5 mode was evaluated for each combination of region, direction, and ship class using the normalized root mean squared error RMSE*³⁸, shown in the scatterplot in Fig. 10. The RMSE* lies within the range of [0, 1], with 0 indicating a perfect fit and 1 a maximally erroneous fit. We find that regions with larger amplitudes tend to have lower RMSE*, indicating that our interpretations should be focused on high amplitude regions.

**Fig. 10: Scatter plot of reconstruction error against amplitude of the dominant mode-pair.**

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The vessel voyage data that support the findings of this study are available from AlphaOcean, but restrictions apply to the availability of these data, which were used under license for the current study due to commercial confidentiality agreements, and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of AlphaOcean (https://alphaocean.ai/). Map figures were generated with data from the coastal data by GSHHG.

Code availability

No substantial custom code was developed for this study. Sequential motif analysis was performed using the RandomWalker package (accessible via https://github.com/narnolddd/randomwalker). Dynamic mode decomposition was performed using PyDMD version 1.0.0, and singular spectrum analysis was conducted using py_ssa_lib version 0.0.1. Statistical analysis was performed in Python version 3.9.19 using standard Python libraries and packages. All libraries are freely available and can be installed via standard Python package managers.

References

United Nations Conference on Trade and Development. Review of maritime transport 2017 https://unctad.org/publication/review-maritime-transport-2017 (2017).
U.S. Energy Information Administration. World oil transit chokepoints https://www.eia.gov/international/analysis/special-topics/World_Oil_Transit_Chokepoints (2024).
Rahim, M. M., Islam, M. T. & Kuruppu, S. Regulating global shipping corporations’ accountability for reducing greenhouse gas emissions in the seas. Mar. Policy 69, 159–170 (2016).
Article Google Scholar
Brancaccio, G., Kalouptsidi, M. & Papageorgiou, T. Geography, transportation, and endogenous trade costs. Econometrica 88, 657–691 (2020).
Article Google Scholar
Walker, T. R. et al. Chapter 27—environmental effects of marine transportation. In 2nd edn World Seas: An Environmental Evaluation, (ed. Sheppard, C.) 505–530 (Academic Press, 2019).
Shi, Z. et al. Perspectives on shipping emissions and their impacts on the surface ocean and lower atmosphere: an environmental-social-economic dimension. Elementa Sci. Anthr. 11, 00052 (2023).
Article Google Scholar
Ducruet, C. The geography of maritime networks: a critical review. J. Transp. Geogr. 88, 102824 (2020).
Article Google Scholar
Kaluza, P., Kölzsch, A., Gastner, M. T. & Blasius, B. The complex network of global cargo ship movements. J. R. Soc. Interface 7, 1093–1103 (2010).
Article PubMed PubMed Central Google Scholar
Álvarez, N. G., Adenso-Díaz, B. & Calzada-Infante, L. Maritime traffic as a complex network: a systematic review. Netw. Spat. Econ. 21, 387–417 (2021).
Article Google Scholar
Faure, M.-A. & Ducruet, C. The Blue Connection. A Systematic and Critical Review of Shipping Network Research. Netw. Spat. Econ. https://doi.org/10.1007/s11067-025-09705-y (2025).
Ducruet, C. Shipping network analysis: state-of-the-art and application to the global financial crisis. In Port Systems in Global Competition, 300–333 (Routledge, 2023).
Hu, Y. & Zhu, D. Empirical analysis of the worldwide maritime transportation network. Phys. A Stat. Mech. Appl. 388, 2061–2071 (2009).
Article Google Scholar
Ge, J., fu, Q., Zhang, Q. & Wan, Z. Regional operating patterns of world container shipping network: a perspective from motif identification. Phys. A Stat. Mech. Appl. 607, 128171 (2022).
Article Google Scholar
Peng, P., Poon, J. P., Yang, Y., Lu, F. & Cheng, S. Global oil traffic network and diffusion of influence among ports using real time data. Energy 172, 333–342 (2019).
Article Google Scholar
Liu, Q., Yang, Y., Ke, L. & Ng, A. K. Structures of port connectivity, competition, and shipping networks in Europe. J. Transp. Geogr. 102, 103360 (2022).
Article Google Scholar
Zhang, Q., Pu, S., Luo, L., Liu, Z. & Xu, J. Revisiting important ports in container shipping networks: a structural hole-based approach. Transp. Policy 126, 239–248 (2022).
Article Google Scholar
Ducruet, C. & Notteboom, T. The worldwide maritime network of container shipping: spatial structure and regional dynamics. Glob. Netw. 12, 395–423 (2012).
Article Google Scholar
González Laxe, F., Jesus Freire Seoane, M. & Pais Montes, C. Maritime degree, centrality and vulnerability: port hierarchies and emerging areas in containerized transport (2008-2010). J. Transp. Geogr. 24, 33–44 (2012).
Article Google Scholar
Guerrero, D., Letrouit, L. & Pais-Montes, C. The container transport system during covid-19: an analysis through the prism of complex networks. Transp. Policy 115, 113–125 (2022).
Article Google Scholar
Asgari, N., Farahani, R. Z. & Goh, M. Network design approach for hub ports-shipping companies competition and cooperation. Transp. Res. Part A Policy Pract. 48, 1–18 (2013).
Article Google Scholar
Seebens, H., Schwartz, N., Schupp, P. J. & Blasius, B. Predicting the spread of marine species introduced by global shipping. Proc. Natl. Acad. Sci. USA 113, 5646–5651 (2016).
Article ADS CAS PubMed PubMed Central Google Scholar
Seebens, H., Gastner, M. T. & Blasius, B. The risk of marine bioinvasion caused by global shipping. Ecol. Lett. 16, 782–790 (2013).
Article CAS PubMed Google Scholar
Yan, Z. et al. Analysis of global marine oil trade based on automatic identification system (ais) data. J. Transp. Geogr. 83, 102637 (2020).
Article Google Scholar
Peng, P., Yang, Y., Cheng, S., Lu, F. & Yuan, Z. Hub-and-spoke structure: characterizing the global crude oil transport network with mass vessel trajectories. Energy 168, 966–974 (2019).
Article Google Scholar
International Maritime Organization. Guidelines for voluntary use of the ship energy efficiency operational indicator (eeoi). Circular MEPC.1/Circ.684, International Maritime Organization, London (2009). https://www.cdn.imo.org/localresources/en/OurWork/Environment/Documents/Circ-684.pdf.
Xu, J., Wickramarathne, T. L. & Chawla, N. V. Representing higher-order dependencies in networks. Sci. Adv. 2, e1600028 (2016).
Article ADS PubMed PubMed Central Google Scholar
Teo, K., Arnold, N., Hone, A. & Kiss, I. Z. Performance of higher-order networks in reconstructing sequential paths: from micro to macro scale. J. Complex Netw. 13, cnae050 (2025).
Article MathSciNet Google Scholar
Scholtes, I. When is a network a network? Multi-order graphical model selection in pathways and temporal networks. In Proc. 23rd ACM SIGKDD international conference on knowledge discovery and data mining, 1037–1046 (ACM, 2017).
Cliff, N. Dominance statistics: ordinal analyses to answer ordinal questions. Psychol. Bull. 114, 494–509 (1993).
Article Google Scholar
Hess, M. R. & Kromrey, J. D. Robust confidence intervals for effect sizes: a comparative study of cohen’sd and Cliff’s delta under non-normality and heterogeneous variances. in Annual Meeting of the American Educational Research Association, vol. 1 (Citeseer, 2004).
Benjamini, Y. & Yekutieli, D. The control of the false discovery rate in multiple testing under dependency. Ann. Stat. 29, 1165–1188 (2001).
Article MathSciNet Google Scholar
Stopa, J. E. & Cheung, K. F. Periodicity and patterns of ocean wind and wave climate. J. Geophys. Res. Oceans 119, 5563–5584 (2014).
Article ADS Google Scholar
Inchauspe, J., Li, J. & Park, J. Seasonal patterns of global oil consumption: implications for long term energy policy. J. Policy Model. 42, 536–556 (2020).
Article Google Scholar
Raju, T. B., Chauhan, P., Tiwari, S. & Kashav, V. Seasonality in freight rates. J. Int. Logist. Trade 18, 149–157 (2020).
Article Google Scholar
Schmid, P. J. Dynamic mode decomposition of numerical and experimental data. J. Fluid Mech. 656, 5–28 (2010).
Article ADS MathSciNet CAS Google Scholar
Askham, T. & Kutz, J. N. Variable projection methods for an optimized dynamic mode decomposition. SIAM J. Appl. Dyn. Syst. 17, 380–416 (2018).
Article MathSciNet Google Scholar
Sashidhar, D. & Kutz, J. N. Bagging, optimized dynamic mode decomposition for robust, stable forecasting with spatial and temporal uncertainty quantification. Philos. Trans. R. Soc. A 380, 20210199 (2022).
Article ADS MathSciNet Google Scholar
Müller-Plath, G. & Lüdecke, H.-J. Normalized coefficients of prediction accuracy for comparative forecast verification and modeling. Res. Stat. 2, 2317172 (2024).
Article Google Scholar
Hassani, H. & Mahmoudvand, R. Multivariate singular spectrum analysis: a general view and new vector forecasting approach. Int. J. Energy Stat. 1, 55–83 (2013).
Article Google Scholar
U.S. Energy Information Administration. Monthly energy review September 2024. Tech. Rep., U.S. Department of Energy. https://www.eia.gov/totalenergy/data/monthly/. Accessed: 2024-08-27 (2024).
Wu, X. & Chen, G. Global overview of crude oil use: from source to sink through inter-regional trade. Energy Policy 128, 476–486 (2019).
Article Google Scholar
U.S. Energy Information Administration. Petroleum supply annual, volume 1. Tech. Rep., U.S. Department of Energy. https://www.eia.gov/petroleum/supply/annual/volume1/. Accessed: 2024-09-27 (2024).
Chen, X. et al. Intelligent ship route planning via an a^* search model enhanced double-deep q-network. Ocean Eng. 327, 120956 (2025).
Article Google Scholar
Chen, X. et al. Ship energy consumption analysis and carbon emission exploitation via spatial-temporal maritime data. Appl. Energy 360, 122886 (2024).
Article CAS Google Scholar
Hylleberg, S. Seasonality in Regression (Academic Press, 2014).
LaRock, T., Xu, M. & Eliassi-Rad, T. A path-based approach to analyzing the global liner shipping network. EPJ Data Sci. 11, 18 (2022).
Article Google Scholar
Si, R., Jia, P., Li, H. & Zhao, X. Assessing the structural resilience of the global crude oil maritime transportation network: a motif-based approach from network to ports. J. Transp. Geogr. 123, 104123 (2025).
Article Google Scholar
Xu, M., Deng, W., Zhu, Y. & LÜ, L. Assessing and improving the structural robustness of global liner shipping system: a motif-based network science approach. Reliab. Eng. Syst. Saf. 240, 109576 (2023).
Article Google Scholar
Wei, X. et al. Resilience analysis of container port networks based on motif dynamics. In 2023 7th International Conference on Transportation Information and Safety (ICTIS), 263–266 (IEEE, 2023).
Akaike, H. Information theory and an extension of the maximum likelihood principle. In Selected papers of hirotugu akaike, 199–213 (Springer, 1998).
Neath, A. A. & Cavanaugh, J. E. The bayesian information criterion: background, derivation, and applications. Wiley Interdiscip. Rev. Comput. Stat. 4, 199–203 (2012).
Article Google Scholar
Wilks, S. S. The large-sample distribution of the likelihood ratio for testing composite hypotheses. Ann. Math. Stat. 9, 60–62 (1938).
Article Google Scholar
Ferrari, S. & Cribari-Neto, F. Beta regression for modelling rates and proportions. J. Appl. Stat. 31, 799–815 (2004).
Article MathSciNet Google Scholar
Geissinger, E. A., Khoo, C. L., Richmond, I. C., Faulkner, S. J. & Schneider, D. C. A case for beta regression in the natural sciences. Ecosphere 13, e3940 (2022).
Article Google Scholar
Proctor, J. L. & Eckhoff, P. A. Discovering dynamic patterns from infectious disease data using dynamic mode decomposition. Int. Health 7, 139–145 (2015).
Article PubMed PubMed Central Google Scholar
Griffith, T. D. & Hubbard Jr, J. E. System identification methods for dynamic models of brain activity. Biomed. Signal Process. Control 68, 102765 (2021).
Article Google Scholar
Brunton, B. W., Johnson, L. A., Ojemann, J. G. & Kutz, J. N. Extracting spatial-temporal coherent patterns in large-scale neural recordings using dynamic mode decomposition. J. Neurosci. Methods 258, 1–15 (2016).
Article PubMed Google Scholar
Demo, N., Tezzele, M. & Rozza, G. Pydmd: Python dynamic mode decomposition. J. Open Source Softw. 3, 530 (2018).
Article ADS Google Scholar
Ichinaga, S. M. et al. Pydmd: a Python package for robust dynamic mode decomposition. J. Mach. Learn. Res. 25, 1–9 (2024).
Google Scholar
Vaart, A. W. v. d. Relative Efficiency of Tests. Cambridge Series in Statistical and Probabilistic Mathematics (Cambridge University Press, 1998).

Download references

Acknowledgements

K.T. acknowledges the PhD studentship support from Northeastern University.

Author information

Authors and Affiliations

Network Science Institute, Northeastern University London, London, UK
Kevin Teo, Naomi Arnold & István Z. Kiss
School of Engineering, Mathematics & Physics, University of Kent, Canterbury, UK
Andrew Hone
AlphaOcean.ai Ltd, Brighton, East Sussex, UK
Michael Coulon & Martin Ireland
Machine Intelligence Group for the Betterment of Health and the Environment, Network Science Institute, Northeastern University, Boston, MA, USA
Mauricio Santillana
Department of Mathematics, Northeastern University, Boston, MA, USA
István Z. Kiss

Authors

Kevin Teo
View author publications
Search author on:PubMed Google Scholar
Naomi Arnold
View author publications
Search author on:PubMed Google Scholar
Andrew Hone
View author publications
Search author on:PubMed Google Scholar
Michael Coulon
View author publications
Search author on:PubMed Google Scholar
Martin Ireland
View author publications
Search author on:PubMed Google Scholar
Mauricio Santillana
View author publications
Search author on:PubMed Google Scholar
István Z. Kiss
View author publications
Search author on:PubMed Google Scholar

Contributions

K.T., N.A., M.C., M.I., and I.Z.K. designed research; K.T. and N.A. performed research; M.S. contributed analytic tools; K.T., N.A., A.H., M.C., M.S., and I.Z.K. analyzed data; K.T., N.A., A.H., M.C., M.S., and I.Z.K. wrote the paper.

Corresponding authors

Correspondence to Kevin Teo or István Z. Kiss.

Ethics declarations

Competing interests

K.T., N.A., A.H., and M.S. declare no competing interests. M.I. and M.C. are employees of a shipping-related company, AlphaOcean.ai. M.I., M.C., and I.Z.K. own shares in AlphaOcean.ai. The start-up company’s core product is building optimization models to assist shipowners with refueling strategies, which are not related to the research in this paper.

Peer review

Peer review information

Nature Communications thanks Shaobo Wang, Wen-Long Shang, and the other anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (download PDF )

Reporting Summary (download PDF )

Transparent Peer Review file (download PDF )

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Teo, K., Arnold, N., Hone, A. et al. Unveiling individual and collective temporal patterns in the tanker shipping network. Nat Commun 17, 3300 (2026). https://doi.org/10.1038/s41467-026-70013-1

Download citation

Received: 30 January 2025
Accepted: 16 February 2026
Published: 27 February 2026
Version of record: 09 April 2026
DOI: https://doi.org/10.1038/s41467-026-70013-1