Efficient curve fitting with penalized B-splines for oceanographic and ecological applications

Bak, Kwan-Young; Lee, Dong-Young; Lee, Ju-Seong; Jee, Hee-Jung; Park, R. Jisung; Koo, Ja-Yong; Jhong, Jae-Hwan

doi:10.1038/s41598-025-05779-3

Article
Open access
Published: 01 July 2025

Kwan-Young Bak^1,2,
Dong-Young Lee³^na1,
Ju-Seong Lee³^na1,
Hee-Jung Jee³,
R. Jisung Park⁴,
Ja-Yong Koo⁵ &
…
Jae-Hwan Jhong³

Scientific Reports volume 15, Article number: 21958 (2025)

Abstract

This study introduces a penalized B-spline approach for estimating smooth curves, incorporating a total variation penalty to balance flexibility and interpretability. By leveraging group penalties and the Alternating Direction Method of Multipliers (ADMM) algorithm, the method ensures consistency across response variables and computational efficiency. We applied this approach to two real-world datasets: oceanographic drifter data in the Niño 4 region and Demoiselle Crane migration data. The fitted trajectories closely captured both large-scale trends and localized variations, demonstrating robustness against noise and irregularly sampled data. This framework is particularly advantageous for analyzing spatiotemporal data, as it effectively removes unnecessary knots and adapts to the complexity of underlying patterns. The total variation penalty controls curve smoothness by penalizing abrupt changes in the estimated function, while the group penalty ensures that all response variables share a consistent set of knots, enhancing interpretability. Although this study focused on two-dimensional spatial trajectories, the methodology is designed for general p-dimensional data and can be extended to three-dimensional datasets, such as avian flight paths or marine animal diving behaviors. Future research could refine the approach by dynamically selecting penalty parameters or expanding its applicability to broader multidimensional settings. This robust and adaptable technique provides a practical tool for analyzing complex spatiotemporal data across various scientific disciplines.

Introduction

Curve fitting is a fundamental task in statistical modeling and data analysis, with applications spanning diverse fields such as physics, engineering, economics, and biology. Various methods have been developed over the years to address the challenges of fitting curves to observed data, including polynomial regression, splines, and kernel-based methods. B-splines, in particular, have emerged as a popular approach due to their flexibility and ability to provide smooth approximations while maintaining local control^1,2,3,4,5,6,7. The use of B-splines in curve fitting has been extensively studied, and numerous advancements have been made to enhance their efficiency and robustness^5,6,7,8,9.

Penalized regression is a powerful tool in statistical modeling, where additional penalty terms are incorporated into the loss function to control the complexity of the fitted model. This helps prevent overfitting and ensures the smoothness or sparsity of the model, depending on the context. The penalty terms typically impose constraints on the model parameters, ensuring that the resulting estimates exhibit desirable properties. One popular form of penalization for promoting smoothness in regression is the total variation penalty, which limits the roughness of the estimated function^10,11. Penalized regression techniques, such as the lasso and ridge regression, have also been combined with B-splines to control model complexity and prevent overfitting^2,12. Recent developments have focused on improving the interpretability of fitted curves and extending these methods to high-dimensional data settings^3,13,14.

The generalized lasso extends the classical lasso framework by incorporating a flexible penalty structure that can include not only simple sparsity but also penalties based on differences between coefficients, such as the total variation penalty^15,16. This flexibility is particularly valuable when dealing with structured data, and it has been shown to be effective in capturing spatial and temporal dependencies in various applications. The penalized B-spline approach discussed in this paper is a specific instance of this broader generalized lasso framework.

The Alternating Direction Method of Multipliers (ADMM) algorithm is an optimization technique that has gained significant popularity in recent years, especially for problems involving separable objectives and complex constraints. ADMM decomposes complex optimization problems into simpler subproblems that can be solved iteratively^17,18. It has been effectively used for a wide range of statistical learning problems, including those involving lasso and generalized lasso penalties^15,17,19,20. Previous research has shown that ADMM is particularly well-suited for solving lasso and generalized lasso problems due to its ability to handle non-differentiable penalty terms efficiently^19,20. These studies demonstrate the robustness and scalability of ADMM in high-dimensional settings, making it a natural choice for penalized regression problems in practical applications^17,18.

Group lasso is another extension of the lasso method, designed to handle situations where predictors can be naturally grouped. This method encourages entire groups of coefficients to be either included or excluded from the model, which is particularly useful when predictors exhibit group-wise sparsity^21,22. Group lasso is closely related to the idea of penalizing differences in B-spline coefficients to ensure smoothness, where the goal is to control the overall structure of the estimated model. Recent research has demonstrated the effectiveness of ADMM for solving group lasso problems. In particular, ref.²³ presents an ADMM-based framework for efficiently solving group lasso problems, showing that ADMM can decompose the group-wise penalty structure into manageable subproblems while ensuring convergence. This approach is particularly advantageous for high-dimensional data, as it provides scalability and robustness in handling group-wise sparsity.

Existing nonparametric function smoothing techniques, such as smoothing splines²⁴, trend filtering^25,26, generalized additive models²⁷, and kernel smoothing²⁸, typically fit p-dimensional response variables by separately applying univariate smoothers to each coordinate before integrating the results. While effective in many cases, these methods lack the ability to capture interdependencies between response coordinates, leading to potential inefficiencies in capturing complex functional relationships. Our proposed penalized B-spline approach, incorporating a group penalty, ensures that all response coordinates share the same set of knots, thereby improving computational efficiency and stabilizing variance across dimensions. This framework provides a principled method for high-dimensional curve estimation, effectively addressing the limitations of traditional smoothing approaches while leveraging the benefits of penalized regression models.

Oceanography has long benefited from advancements in data collection and modeling techniques, enabling a deeper understanding of complex marine processes. The advent of real-time drifter datasets, such as those provided by the Global Drifter Program (GDP), has revolutionized the study of ocean surface currents, sea surface temperatures, and their interactions with climatic phenomena like El Niño and La Niña events (NOAA/AOML, 2024). Drifters, equipped with high-precision sensors, offer an unprecedented view into the spatiotemporal variability of oceanographic features, making them invaluable for validating circulation models and predicting climate anomalies^29,30.

In this study, we leverage penalized B-spline curve fitting to reconstruct drifter trajectories with an emphasis on generalizing ocean current patterns observed from similar locations. Due to the inherent variability of oceanographic processes, drifters released from nearly identical positions can follow markedly different paths influenced by a range of factors. Our approach synthesizes these diverse trajectories into a coherent, generalized representation of regional ocean currents, thus offering a robust framework for analyzing complex marine processes and discerning the underlying flow structures. One application of this method is the rapid construction of ocean current schematic diagrams that succinctly summarize the generalized current patterns. Moreover, when combined with additional information such as swell waves, wind waves, and salinity, the integrated interpretation of the generalized currents becomes much more straightforward, enabling a comprehensive understanding of the interplay between various oceanographic phenomena. This methodology is demonstrated in the Niño 4 region, a pivotal area for studying equatorial current dynamics and trade wind interactions, where the generalized current pattern provides deeper insights into both simplified schematic representations and integrated analyses that consider multiple environmental variables^31,32.

Ecology, as a field, increasingly relies on advanced analytical techniques to unravel the complexities of species behavior and their interactions with the environment. Bird migration studies, for instance, offer critical insights into ecological dynamics, behavioral adaptations, and the impacts of climate change on biodiversity^33,34. The Demoiselle Crane, a species renowned for its extensive seasonal migrations, presents unique challenges for analysis due to irregular GPS tracking intervals and noise in the recorded data^35,36. It is crucial to accurately estimate reliable migration routes by accounting for these characteristics, as this is essential for establishing appropriate protected areas and optimizing habitat management strategies for migratory birds³⁷. In the ecological context, our study applies penalized B-spline curve fitting to estimate migration trajectories based on GPS tracking data from Demoiselle Cranes. The estimated trajectories closely align with those reported in previous studies, demonstrating the reliability of our approach. In large-scale GPS tracking data, noise is frequently present due to the extensive range of movement. A key advantage of our method is its ability to effectively handle noisy data while still producing results comparable to those derived from noise-free datasets. Previous studies primarily relied on high-precision measurements, excluding data with high levels of noise³⁸. However, our approach successfully incorporates noisy data without compromising accuracy, enabling a more cost-effective analysis of migration trajectories. Ultimately, our findings suggest that this methodology reduces costs while maintaining performance even in the presence of noise, providing a basis for its application to other avian species.

This paper makes three primary contributions. First, we propose a curve fitting approach using B-splines with a total variation penalty, which effectively controls the smoothness of the fitted curve. Second, we introduce group penalties to ensure that all response variables share the same set of knots, facilitating both functional smoothness and interpretability. Finally, we apply the ADMM algorithm to efficiently solve the penalized optimization problem, providing an effective and scalable solution for the proposed method. In this paper, we discuss penalized B-spline curve estimation by employing the ADMM algorithm to optimize the penalized objective function. Our approach draws on the principles of generalized lasso, total variation penalties, and other related methodologies to achieve a balance between model fit and smoothness, ensuring a flexible yet interpretable curve fitting framework.

Model and estimator

B-spline curve model

Consider a p-dimensional curve model

$$\begin{aligned} y_i = f(t_i) + \varepsilon _i \quad \text{ for }\quad i = 1, \ldots , n, \end{aligned}$$

(1)

where $y_i = (y_{i1}, \ldots , y_{ip}) \in \mathbb {R}^p$ are p-dimensional response data, $t_i \in [a, b]$ are fixed input points, and $\varepsilon _i$ are independent errors with mean zero and covariance matrix $\sigma ^2 I_p$ with $\sigma > 0$. The goal of this study is to estimate the underlying curve f given space-time data $(t_1, y_1), \ldots , (t_n, y_n)$. To estimate $f = (f_1, \ldots , f_p): [a, b] \rightarrow \mathbb {R}^p$, we consider penalized regression splines with a total variation penalty.

Let $\mathcal {S}$ be the set of splines of order m defined on [a, b] with an increasing knot sequence $\xi = \{ \xi _1, \ldots , \xi _K \}$, where $a< \xi _1< \cdots< \xi _K < b$. Any spline $s \in \mathcal {S}$ can be expressed as a linear combination of the B-spline basis functions $B_1, \ldots , B_J$, where $J = K + m$ is the dimension of $\mathcal {S}$.

Denote the B-spline curve

$$s(\cdot ;\beta ) = (s^1(\cdot ; \beta ^1), \ldots , s^p(\cdot ; \beta ^p)) = \sum _{j = 1}^J \beta _j B_j(\cdot ),$$

where $\beta ^\ell = (\beta _1^\ell , \ldots , \beta _J^\ell )$, $\beta _j = (\beta _j^1, \ldots , \beta _j^p) \in \mathbb {R}^p$, and $\beta = [\beta _j^\ell ] \in \mathbb {R}^{J \times p}$ is the coefficient matrix, with $j = 1, \ldots , J$ and $\ell = 1, \ldots , p$. Note that the $\ell$th coordinate function $s^\ell$ of the B-spline curve is determined by $\beta ^\ell$ such that

$$s^\ell (\cdot ;\beta ^\ell ) = \sum _{j = 1}^J \beta _j^\ell B_j(\cdot ) \quad \text{ for }\quad \ell = 1, \ldots , p.$$

In the B-spline curve, the rows $\beta _j$ of the coefficient matrix $\beta$ serve as control points, which play a key role in determining the shape and position of the curve^39,40. Each control point influences only a local part of the curve, so moving a control point allows precise, localized adjustments to the curve's shape.
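The basis expansion above can be illustrated with a minimal, dependency-free sketch. It assumes a clamped knot vector (each boundary repeated m times) and evaluates the basis with the Cox-de Boor recursion; the function names `bspline_basis` and `spline_curve`, the knot placement, and the control points are hypothetical choices for illustration and are not taken from the authors' R package.

```python
def bspline_basis(t, interior_knots, a, b, m):
    """Return [B_1(t), ..., B_J(t)] for order-m B-splines on [a, b], J = K + m."""
    # Clamped knot vector: each boundary repeated m times (assumed convention).
    knots = [a] * m + list(interior_knots) + [b] * m
    n_int = len(knots) - 1
    # Index of the last non-empty knot interval, so that t == b is covered.
    last = max(i for i in range(n_int) if knots[i] < knots[i + 1])
    # Order 1: indicator functions of the half-open knot intervals.
    vals = [
        1.0 if (knots[i] <= t < knots[i + 1]) or (t == b and i == last) else 0.0
        for i in range(n_int)
    ]
    # Cox-de Boor recursion: build order k from order k - 1.
    for k in range(2, m + 1):
        nxt = []
        for i in range(len(knots) - k):
            d1 = knots[i + k - 1] - knots[i]
            d2 = knots[i + k] - knots[i + 1]
            left = (t - knots[i]) / d1 * vals[i] if d1 > 0 else 0.0
            right = (knots[i + k] - t) / d2 * vals[i + 1] if d2 > 0 else 0.0
            nxt.append(left + right)
        vals = nxt
    return vals


def spline_curve(t, beta, interior_knots, a, b, m):
    """Evaluate s(t; beta) = sum_j beta_j B_j(t); beta is a J x p list of rows."""
    basis = bspline_basis(t, interior_knots, a, b, m)
    p = len(beta[0])
    return [sum(basis[j] * beta[j][l] for j in range(len(basis))) for l in range(p)]


# Illustrative setup: cubic splines (order m = 4) with K = 3 interior knots.
interior = [0.25, 0.5, 0.75]
m = 4
# J = K + m = 7 hypothetical control points in the plane (p = 2).
beta = [[0, 0], [1, 2], [2, 3], [3, 1], [4, 0], [5, 2], [6, 4]]
start = spline_curve(0.0, beta, interior, 0.0, 1.0, m)
end = spline_curve(1.0, beta, interior, 0.0, 1.0, m)
```

With a clamped knot vector the curve starts at the first control point and ends at the last one, and the basis functions sum to one at every point of [a, b], which is why moving a single control point only reshapes the curve locally.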

Penalized curve estimation via group total variation norm

The residual sum of squares is defined as

$$R(\beta ) = \frac{1}{2} \sum _{i = 1}^n {\Vert y_i - s(t_i;\beta ) \Vert }_2^2,$$

where ${\Vert \cdot \Vert }_2$ denotes the Euclidean norm of a vector.

For the penalties on the proposed estimators, we adopt the total variation of the $(m - 1)$th derivative of $s^\ell$ for $\ell = 1, \ldots , p$, which corresponds to the coefficient of the highest-degree term in $s^\ell$ within the B-spline scheme. This derivative is a piecewise constant function that changes only at the knot positions, where each jump size corresponds to the degree of change at each knot. Thus, knots with zero jump sizes are naturally regarded as unnecessary and can be removed. The total variation penalty can be described by the jump matrix $D \in \mathbb {R}^{(J - m) \times J}$ that represents the jump sizes of the $(m - 1)$th derivative of the B-splines at the interior knots. For more details on the total variation penalty and the corresponding jump matrix, see ref.⁴¹.

Observe

$$D\beta = \begin{bmatrix}D\beta ^1&\cdots&D\beta ^p \end{bmatrix}\in \mathbb {R}^{(J - m) \times p}.$$

Each column of the matrix $D\beta$ contains the jump sizes at the $(J - m)$ interior knots for the coordinate-specific B-spline $s^\ell$, and its $L_1$ norm corresponds to the total variation. In this study, we additionally propose a group-type total variation $L_2$ norm as a mechanism to ensure that all components select the same set of knots. Thus, the proposed penalty function is defined by

$$p(\beta ) = \sqrt{p}\sum _{j = 1}^{J - m} {\Vert d^j\beta \Vert }_2,$$

where $d^j$ is the jth row of D and $d^j\beta = (d^j \beta ^1, \cdots , d^j \beta ^p) \in \mathbb {R}^p$ represents the jump sizes at the jth knot for each component. By introducing a group penalty term on this vector, we can ensure homogeneity at the knot level, thereby enhancing both the accuracy and interpretability of the estimator.
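To make the penalty concrete, the sketch below builds one possible jump matrix and evaluates $p(\beta )$ with NumPy. It assumes a clamped knot vector and uses the standard B-spline derivative recursion, in which each differentiation turns coefficients into scaled first differences; the authors' construction in ref.⁴¹ may differ, for example in scaling. The names `jump_matrix` and `group_tv_penalty` are hypothetical.

```python
import numpy as np


def jump_matrix(interior_knots, a, b, m):
    """K x J matrix mapping coefficients to jumps of the (m-1)th derivative."""
    # Clamped knot vector (assumed convention): boundaries repeated m times.
    tau = np.concatenate([
        np.full(m, float(a)),
        np.asarray(interior_knots, dtype=float),
        np.full(m, float(b)),
    ])
    J = len(interior_knots) + m
    # T maps beta to the coefficients of the current derivative of the spline.
    T = np.eye(J)
    for r in range(1, m):
        # r-th derivative: scaled first differences of the previous coefficients.
        j = np.arange(r, J)
        scale = (m - r) / (tau[j + m - r] - tau[j])
        T = scale[:, None] * (T[1:] - T[:-1])
    # T now gives the piecewise-constant values on the K + 1 knot intervals;
    # differences of neighbouring values are the jumps at the K interior knots.
    return T[1:] - T[:-1]


def group_tv_penalty(beta, D):
    """p(beta) = sqrt(p) * sum_j ||d^j beta||_2 for a J x p coefficient matrix."""
    jumps = D @ beta  # row j holds the jump sizes at knot j for every coordinate
    p = beta.shape[1]
    return np.sqrt(p) * np.linalg.norm(jumps, axis=1).sum()


# Linear splines (m = 2) with a single interior knot at 0.5, so J = 3.
D2 = jump_matrix([0.5], 0.0, 1.0, 2)
# A "tent" in both coordinates: the slope changes from +2 to -2 at the knot.
beta_tent = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
penalty = group_tv_penalty(beta_tent, D2)
```

In this toy case `D2` equals `[[2, -4, 2]]`, each coordinate has a slope jump of -4 at the knot, and the penalty is $\sqrt{2} \cdot \Vert (-4, -4) \Vert _2 = 8$. A row of $D\beta$ that is entirely zero means no coordinate changes at that knot, which is the situation in which the knot can be dropped.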

To this end, the penalized objective function to be minimized is

$$R^\lambda (\beta ) = R(\beta ) + \lambda p(\beta ),$$

where $\lambda \ge 0$ plays the role of a smoothing parameter. Define the estimator of the coefficient matrix $\beta$ as

$$\hat{\beta }= \mathop {\textrm{argmin}}\limits _{\beta \in \mathbb {R}^{J \times p}} R^\lambda (\beta )$$

and finally the penalized B-spline curve estimator (PBCE) is given by

$$\hat{f}(\cdot ) = s(\cdot ; \hat{\beta }).$$

Optimization approach with ADMM

We utilize the Alternating Direction Method of Multipliers (ADMM) algorithm to address a constrained convex optimization problem. The ADMM algorithm has gained significant attention in statistical and machine learning applications due to its efficiency in solving complex optimization problems. By decomposing large-scale problems into a series of smaller, more manageable sub-problems, ADMM enables efficient, parallelized computations. This iterative approach alternates between updating variables, which allows for distributed optimization and simplifies the handling of complex constraints. Consequently, ADMM has become a powerful tool for optimizing tasks in high-dimensional settings, especially where computational resources are limited or distributed systems are needed. For a more in-depth treatment of the ADMM algorithm, see refs.^42,43,44.

The minimization problem can be reformulated, with an auxiliary variable $\alpha \in \mathbb {R}^{(J - m) \times p}$, as

$$\min _{\beta , \alpha } \frac{1}{2} {\Vert Y - B\beta \Vert }_2^2 + \lambda {\Vert \alpha \Vert }_G \quad \text{ subject } \text{ to }\quad D\beta = \alpha ,$$

where $Y = \begin{bmatrix}y_i^\ell \end{bmatrix}\in \mathbb {R}^{n \times p}$ is the response matrix and $B = \begin{bmatrix}B_j(t_i) \end{bmatrix}\in \mathbb {R}^{n \times J}$ is the B-spline basis matrix. Also, for a matrix $A \in \mathbb {R}^{r \times c}$, ${\Vert A \Vert }_2$ denotes the $L_2$ norm of the vectorized matrix A and ${\Vert A \Vert }_G$ denotes the row-wise $L_2$ group norm

$${\Vert A \Vert }_G = \sqrt{c} \sum _{j = 1}^r {\Vert A^j \Vert }_2,$$

where $A^j$ is the jth row of A.

The augmented Lagrangian is defined as

$$L_\eta (\beta , \alpha ) = \frac{1}{2}{\Vert Y - B\beta \Vert }_2^2 + \lambda {\Vert \alpha \Vert }_G + \frac{\eta }{2} {\Vert D\beta - \alpha + u \Vert }_2^2,$$

where $u = \begin{bmatrix}u_j^\ell \end{bmatrix}\in \mathbb {R}^{(J - m) \times p}$ is the scaled dual variable. The update for $\beta$ is

$$\beta \leftarrow \left( B^\top B + \eta D^\top D \right) ^{-1} \left( B^\top Y + \eta D^\top (\alpha - u) \right)$$

The update for $\alpha$ is

$$\begin{aligned} \alpha&\leftarrow \mathop {\textrm{argmin}}\limits _\alpha \left( \frac{\eta }{2}{\Vert D\beta - \alpha + u \Vert }_2^2 + \lambda \sqrt{p} \sum _{j = 1}^{J - m} {\Vert \alpha ^j \Vert }_2 \right) \\&= \mathop {\textrm{argmin}}\limits _\alpha \left( \frac{1}{2} {\Vert D\beta - \alpha + u \Vert }_2^2 + \frac{\lambda \sqrt{p}}{\eta } \sum _{j = 1}^{J - m} {\Vert \alpha ^j \Vert }_2 \right) \\&= \left( {{\bf {prox}}}_{\frac{\lambda \sqrt{p}}{\eta }}^1 \left( d^1\beta + u^1\right) , \ldots , {{\bf {prox}}}_{\frac{\lambda \sqrt{p}}{\eta }}^{J - m} \left( d^{J - m}\beta + u^{J - m}\right) \right) \\&\triangleq {{\bf {prox}}}_{\frac{\lambda \sqrt{p}}{\eta }} \left( D\beta + u\right) , \end{aligned}$$

where

$${{\bf {prox}}}_\lambda ^j (A) = A^j \cdot \max \left( 1 - \frac{\lambda }{{\Vert A^j \Vert }_2}, 0 \right)$$

is the proximal operator of the group norm for the jth row vector $A^j$ of the matrix $A$ (refs.^45,46). The procedure for implementing the proposed method is outlined in Algorithm 1. The algorithm was packaged as an R program, and it is provided along with a manual file detailing its main functions, simulation, and data analysis code as supplementary materials.
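The two updates above can be assembled into a short NumPy sketch. This is not the authors' Algorithm 1 or their R implementation: the dual step `u <- u + D beta - alpha` is the standard scaled-form ADMM update and is assumed here because the text does not state it, and the stopping rule, the default `eta`, and the names `group_soft_threshold` and `admm_pbce` are likewise illustrative choices. The demo uses toy inputs (identity basis matrix and a first-difference matrix in place of the spline quantities) purely to keep the block self-contained.

```python
import numpy as np


def group_soft_threshold(A, thresh):
    """Row-wise proximal operator: A^j * max(1 - thresh / ||A^j||_2, 0)."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    # Rows with zero norm get ratio = inf, so they are mapped to zero.
    ratio = np.divide(thresh, norms, out=np.full_like(norms, np.inf), where=norms > 0)
    return A * np.maximum(1.0 - ratio, 0.0)


def admm_pbce(Y, B, D, lam, eta=1.0, n_iter=2000, tol=1e-10):
    """Sketch of ADMM for 0.5 * ||Y - B beta||^2 + lam * ||D beta||_G."""
    p = Y.shape[1]
    alpha = np.zeros((D.shape[0], p))
    u = np.zeros_like(alpha)
    lhs = B.T @ B + eta * (D.T @ D)  # fixed across iterations
    BtY = B.T @ Y
    for _ in range(n_iter):
        # beta-update: ridge-type linear system from the augmented Lagrangian.
        beta = np.linalg.solve(lhs, BtY + eta * D.T @ (alpha - u))
        Dbeta = D @ beta
        # alpha-update: group soft-thresholding with threshold lam * sqrt(p) / eta.
        alpha_old = alpha
        alpha = group_soft_threshold(Dbeta + u, lam * np.sqrt(p) / eta)
        # Scaled dual update (standard ADMM step; an assumption, see lead-in).
        u = u + Dbeta - alpha
        primal = np.linalg.norm(Dbeta - alpha)
        dual = eta * np.linalg.norm(D.T @ (alpha - alpha_old))
        if primal < tol and dual < tol:
            break
    return beta, alpha


# Toy demo with assumed inputs: B = identity and D = first differences.
rng = np.random.default_rng(0)
n, p = 12, 2
Y = rng.normal(size=(n, p))
B = np.eye(n)
D = np.diff(np.eye(n), axis=0)  # (n - 1) x n first-difference matrix
beta_hat, alpha_hat = admm_pbce(Y, B, D, lam=1e6)
```

With a very large `lam` every row of `alpha_hat` is thresholded to zero, so all jumps vanish and each column of `beta_hat` collapses to the column mean of `Y`; with `lam=0` the fit reproduces `Y`. Because `B.T @ B + eta * D.T @ D` does not change between iterations, a factorization could be computed once and reused.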

Data analysis

In this study, we selected drifter data and crane migration data as case studies for analyzing movement trajectories over time. Both datasets share a common characteristic in that they incorporate temporal and spatial variations, forming curved movement trajectories. Drifter data capture movement influenced by ocean currents, while crane migration data reflect seasonal variations in movement patterns. Thus, both datasets serve as suitable cases for analyzing irregular movement. Accordingly, we applied a penalized B-spline curve fitting technique to model the temporal and spatial changes in both datasets. This approach provides a methodological framework for effectively analyzing various movement patterns and can be utilized to interpret movement trajectories in datasets that incorporate temporal variations. The computational algorithm developed in this study has been implemented as an R package, which, along with a detailed manual, has been uploaded to the authors’ GitHub repository (https://github.com/JaeHwan-Jhong/BsplineCurve). In addition, all data and code used for the analyses and visualizations presented in this paper have been organized into R Markdown and PDF files and made publicly available for unrestricted use.

Real time drifter data

We present a novel application of penalized B-spline curve fitting to real oceanographic data, focusing on the Niño 4 region in the central equatorial Pacific Ocean. Utilizing data from the Global Drifter Program (GDP), we demonstrate how this method addresses challenges inherent to drifter trajectory analysis, such as noise and irregular sampling intervals.

Data collection

The GDP provides real-time drifter data that are essential for understanding ocean surface currents and sea surface temperature patterns⁴⁷. Drifters are autonomous, floating instruments deployed across the world’s oceans to collect data that contribute to a variety of scientific research, including studies on climate, ocean circulation, and marine ecosystems. Each drifter is equipped with sensors that transmit location and environmental measurements via satellite, offering high-resolution data that are crucial for oceanographic models and climate forecasts. This real-time drifter dataset supports a wide range of research applications, from validating ocean current models to improving our understanding of air-sea interactions, which are vital for assessing climate change impacts. For more information, see NOAA’s Global Drifter Program website⁴⁷.

We obtained the drifter data from the NOAA Atlantic Oceanographic and Meteorological Laboratory’s ERDDAP server. The data span from April 2, 2020, to May 10, 2020, covering a geographical region defined by $5^\circ$N-$5^\circ$S latitude and $160^\circ$W-$150^\circ$W longitude, which corresponds to the Niño 4 area in the central equatorial Pacific Ocean. The dataset includes six-hourly measurements of longitude and latitude for drifters with World Meteorological Organization (WMO) identifiers 5102764, 5102765, and 5102766. These specific drifters were selected to illustrate an example of a single estimated generalized trajectory based on data starting from the same location. For improved readability, throughout the following sections we refer to drifter tracks 5102764, 5102765, and 5102766 as track 1, track 2, and track 3, respectively.

The Niño 4 region is pivotal for studying ENSO dynamics. Variations in sea surface temperatures and currents here influence global weather patterns, marine biodiversity, and socio-economic conditions. Understanding these processes is essential for predicting climate anomalies such as El Niño and La Niña events.

The first subplot of Fig. 1 illustrates the global map with the Niño 4 region, marked by black dots in the central equatorial Pacific Ocean, spanning $5^\circ$N–$5^\circ$S latitude and $160^\circ$W–$150^\circ$W longitude. This subplot provides geographic context for the study area, emphasizing the region’s importance in understanding equatorial oceanic dynamics.

The subsequent subplot provides an enlarged view of the Niño 4 region, where the observed trajectories of three drifters are displayed, color-coded as red (track 1), blue (track 2), and black (track 3). Track 1, shown in red, exhibited complex patterns with pronounced rotations, suggesting localized variability in ocean currents. The trajectory of track 2, represented in blue, followed a smoother path but displayed noticeable curvature, likely caused by specific localized interactions. In contrast, track 3, depicted in black, maintained a largely monotonic trajectory, aligning closely with the dominant regional currents.

Together, these visualizations highlight the diverse oceanic dynamics within the Niño 4 region. The interplay between localized variability and broader regional influences is evident in the distinct movement patterns of the drifters, underlining the significance of detailed trajectory analysis in understanding equatorial current systems.

Previous studies have revealed interactions between ocean currents and various oceanographic elements such as the atmosphere, waves, and swell waves^48,49,50. In particular, swell waves and wind have been shown to significantly impact ocean currents^51,52,53,54. Based on these findings, we aimed to enhance the interpretation of ocean current movement by presenting information on wind waves, primary swell waves, and secondary swell waves. These three datasets were taken from the CMEMS Marine Data Store⁵⁵.

Figure 2 provides a detailed visualization of oceanographic data for the early, middle, and late periods of the buoy observations, represented by April 5, 2020, April 21, 2020, and May 9, 2020. In this figure, the spatial distribution of variables such as wave height and swell height is shown on the same longitude and latitude coordinates as the buoy trajectories. Additionally, for each dataset (wind wave, primary swell wave, and secondary swell wave), arrows indicate the respective wave direction along with the corresponding ocean current direction at that time. This allows for an intuitive understanding of how these three elements influence the movement of ocean currents. These visual cues illustrate the interactions between ocean currents and other oceanographic elements, and they provide concrete evidence for interpreting ocean current movements based on the observed data.

Model settings

We employed a penalized B-spline approach to reconstruct smooth drifter trajectories while preserving essential movement patterns. A key feature of this method is the incorporation of group penalties, ensuring that all drifters share a consistent set of knots. This enhances comparability and facilitates the study of collective behaviors within oceanic systems.

To optimize the balance between flexibility and smoothness, we applied penalty parameters ($\lambda$) ranging from $10^{-10}$ to 100. Time normalization to the [0, 1] range was performed to ensure numerical stability and a uniform distribution of B-spline basis functions. These settings were tailored to the dataset size (approximately 400 observations), ensuring computational efficiency and accurate representation of trends.
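As a rough illustration of these settings, the sketch below rescales observation times to [0, 1] and builds a candidate grid for $\lambda$ between $10^{-10}$ and 100. The logarithmic spacing and the number of grid points (25) are assumptions, since the text gives only the range; the helper names and the example timestamps are hypothetical.

```python
def normalize_time(times):
    """Rescale observation times linearly so they span [0, 1]."""
    lo, hi = min(times), max(times)
    return [(t - lo) / (hi - lo) for t in times]


def log_grid(lo_exp, hi_exp, n_points):
    """Candidate values 10**lo_exp, ..., 10**hi_exp, equally spaced on a log scale."""
    step = (hi_exp - lo_exp) / (n_points - 1)
    return [10.0 ** (lo_exp + k * step) for k in range(n_points)]


# Hours since the first fix for a six-hourly track with one missing record.
hours = [0, 6, 12, 24]
t_scaled = normalize_time(hours)

# 25 candidate penalty parameters from 1e-10 to 100 (grid size is assumed).
lambdas = log_grid(-10, 2, 25)
```

Working on a log scale spreads the candidates evenly across the twelve orders of magnitude in the stated range, which a linearly spaced grid would not do.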

To choose an optimal complexity parameter, we use the Akaike information criterion (AIC)⁵⁶ and the Bayesian information criterion (BIC)⁵⁷. The AIC and BIC for a sequence of complexity parameters are defined as

$$AIC_k = n \log \left\{ \frac{1}{n} \sum _{i = 1}^n {\Vert y_i - \hat{f}^{\lambda _k} (t_i) \Vert }_2^2 \right\} + 2 J_k \quad \text{ and }\quad BIC_k = n \log \left\{ \frac{1}{n} \sum _{i = 1}^n {\Vert y_i - \hat{f}^{\lambda _k} (t_i) \Vert }_2^2 \right\} + J_k \log (n)$$

for $k = 1, \ldots , K$, where K here denotes the number of candidate $\lambda$ values and $J_k$ denotes the number of nonzero control points for $\lambda _k$. The optimal value for $\lambda$ is chosen as the candidate with the smallest AIC or BIC value. We acknowledge the potential of cross-validation (CV) as an alternative to AIC and BIC. In our initial experiments, we performed an 8:2 data split for cross-validation and observed that the complexity of models selected by CV was generally in line with those chosen using AIC, while BIC tended to select simpler models. Since no substantial difference was found between AIC and CV in terms of selected model complexity, we primarily focused on AIC and BIC for model selection in the manuscript. This tuning parameter selection method was also used in the Demoiselle Crane data analysis.
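The selection rule can be sketched in a few lines. The function names are hypothetical, the degrees-of-freedom value `df` stands in for $J_k$ and is supplied by the caller, and the residual sums of squares below are made-up numbers chosen only to show how the two criteria can disagree.

```python
import math


def aic_bic(rss, n, df):
    """Return (AIC, BIC) = n*log(rss/n) + 2*df and n*log(rss/n) + df*log(n)."""
    fit = n * math.log(rss / n)
    return fit + 2 * df, fit + df * math.log(n)


def select_lambda(candidates, n, criterion="aic"):
    """Pick the lambda with the smallest criterion from (lambda, rss, df) triples."""
    idx = 0 if criterion == "aic" else 1
    best = min(candidates, key=lambda c: aic_bic(c[1], n, c[2])[idx])
    return best[0]


# Illustrative (not real) fits for n = 400 observations.
n = 400
candidates = [
    (1e-4, 30.0, 40),  # flexible fit: small residual, many control points
    (1e-2, 40.0, 12),  # smoother fit
    (1.0, 90.0, 5),    # very smooth fit
]
lambda_aic = select_lambda(candidates, n, "aic")
lambda_bic = select_lambda(candidates, n, "bic")
```

For these numbers AIC selects `1e-4` while BIC selects `1e-2`: the BIC complexity term $J_k \log (n)$ exceeds the AIC term $2 J_k$ whenever $n > e^2$, which matches the tendency of BIC to choose simpler models noted above.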

Results and discussion

Figure 3 illustrates the observed trajectories of three drifters (track 1, track 2, and track 3) within the Niño 4 region and their respective fitted lines. The first subplot (top left) provides an overview of the fitted lines for all three drifters, using the same color scheme as Fig. 1: red for track 1, blue for track 2, and black for track 3. These fitted lines capture both localized variability and broader regional trends in ocean currents.

The subsequent subplots provide a detailed analysis of each drifter’s observed trajectory and its corresponding fitted line. The second subplot (top right) focuses on track 1. Track 1 exhibits a highly intricate path, starting from the lower-right section of the plot and transitioning into sharp turns and pronounced loops near its final segment. These features suggest intense localized variability in ocean currents. The fitted line follows these patterns closely, effectively capturing the sharp changes in direction and rotational movements.

The third subplot (bottom left) examines track 2, whose trajectory presents a smoother curve compared to track 1. However, the path displays a distinct bending towards the southeast, with a noticeable shift in curvature mid-way through its course. This curvature indicates interactions with localized oceanic systems. The fitted line provides a precise representation of the gradual directional changes observed in this trajectory.

Finally, the last subplot (bottom right) focuses on drifter track 3. This drifter follows a predominantly monotonic and streamlined path with minimal deviations. The trajectory starts with a slight curvature but stabilizes into a nearly linear movement, suggesting alignment with dominant regional currents. The fitted line accurately reflects this uniform and stable progression, reinforcing its alignment with large-scale ocean dynamics.

Figure 4 compares the fitted lines generated using the AIC and BIC, highlighting differences in their ability to capture localized features. The lower-right section of the plots, representing the initial segment of the trajectories, reveals distinct patterns in the behavior of the fitted lines under each criterion. The AIC-optimized model demonstrates a closer fit to the observed data points, closely following minor fluctuations and capturing finer-scale variations in the trajectory. However, this fine-tuned fit introduces a higher risk of overfitting, as seen in the erratic behavior of the fitted line around localized variability. In contrast, the BIC-optimized model provides a smoother trajectory that generalizes the data, sacrificing the ability to follow the finer details in favor of reducing overfitting. This trade-off is evident in the initial segment, where the BIC model fails to capture some localized patterns observed in the data. These differences illustrate the flexibility of the penalized B-spline method in balancing model complexity and generalization, depending on the chosen criterion.

The reconstructed fitted lines underscore the temporal and spatial complexities of ocean currents in the Niño 4 region. In addition, the results demonstrate the method’s capability to enhance the interpretability of drifter data and reveal key insights into ocean surface dynamics, particularly regarding the influence of equatorial currents and trade winds. This evidence, drawn from the detailed analysis of trajectory patterns and the comparison between AIC and BIC optimized models, confirms the robustness of the approach in capturing both large-scale trends and small-scale variations.

Previous statistical methods for estimating ocean currents have comprehensively taken into account key factors such as error analysis, spectral techniques, time series analysis, and wave and tidal modeling^58,59,60. As an extension of these approaches, we propose a novel interpretation of ocean current flows by generalizing buoy trajectories through statistical curve fitting. In particular, our method has the advantage of fitting multivariate data all at once. Moreover, recent technological advances have enabled ocean current measurements using both buoys and radar^61,62. Both methods are considered primary for understanding ocean currents, with radar offering the potential for faster and more accurate interpretations in various regions beyond traditional buoy observations.

Seasonal migration routes of the Demoiselle Crane

We applied a penalized B-spline curve-fitting methodology to analyze the seasonal migration routes of Demoiselle Cranes (Anthropoides virgo). Migration studies provide critical insights into ecological patterns and behavioral adaptations, with Demoiselle Cranes serving as an ideal subject due to their extensive migrations. However, GPS tracking data often present challenges such as irregular intervals, missing values, and noise. To address these issues, our approach combines numerical interpolation with penalized curve-fitting, optimized using the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), to estimate smooth migratory trajectories.

Data collection and preprocessing

The analysis used GPS tracking data collected in Mongolia between August 2018 and November 2023⁶³. To investigate seasonal patterns, data from one year, spanning from August 2018 to August 2019, were selected for analysis. A scatter plot of the raw data, shown in Fig. 5, illustrates the distribution of GPS points over time.

To account for the irregular nature of GPS data collection, numerical interpolation was performed to adjust for uneven time gaps. The initial recording time was set to $t_1 = 0$, with $t$ increasing by 1 for each hour. Since the dataset covered a full year, $t$ spanned from 0 to 8760 (24 h $\times$ 365 days). These values were rescaled to range between 0 and 365 for analysis purposes.
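The preprocessing described above can be sketched as follows. This is a minimal illustration, not the authors' code: the text does not say which interpolation scheme was used, so linear interpolation onto an hourly grid is an assumption, and the function name `interpolate_and_rescale` and the sample coordinates are hypothetical.

```python
import numpy as np

def interpolate_and_rescale(hours, lat, lon, total_hours=8760, days=365):
    """Interpolate irregular GPS fixes onto an hourly grid and rescale time to [0, days].

    Linear interpolation is an assumption; the source only says
    "numerical interpolation".
    """
    hours = np.asarray(hours, dtype=float)
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)

    # np.interp needs increasing sample positions, so sort by time first.
    order = np.argsort(hours)
    hours, lat, lon = hours[order], lat[order], lon[order]

    # Set the initial recording time to t_1 = 0.
    hours = hours - hours[0]

    # One grid point per hour, capped at one year (8760 hours).
    last_hour = int(np.floor(min(total_hours, hours[-1])))
    grid = np.arange(last_hour + 1, dtype=float)

    lat_hourly = np.interp(grid, hours, lat)
    lon_hourly = np.interp(grid, hours, lon)

    # Rescale hours 0..8760 to the 0..365 range used for fitting.
    t = grid * days / total_hours
    return t, lat_hourly, lon_hourly

# Hypothetical fixes recorded at uneven intervals (hours since the first fix).
t, lat_h, lon_h = interpolate_and_rescale(
    hours=[0.0, 2.0, 5.0, 8760.0],
    lat=[47.0, 47.2, 47.5, 47.0],
    lon=[100.0, 100.4, 101.0, 100.0],
)
```

After this step each unit of $t$ corresponds to one day, which keeps the time axis on a scale comparable to the penalty parameter grid.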

Model settings

In estimating migration routes, a penalty parameter $\lambda$ was varied between $10^{-5}$ and 10 to balance smoothness and data fidelity. Models were evaluated based on their Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) scores.
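A grid search of this kind can be sketched as below. Several parts are assumptions made for illustration: the logarithmic spacing of the grid, the common Gaussian forms $\mathrm{AIC} = n\log(\mathrm{RSS}/n) + 2\,\mathrm{df}$ and $\mathrm{BIC} = n\log(\mathrm{RSS}/n) + \log(n)\,\mathrm{df}$, and the stand-in smoother `ridge_spline_fit` (a ridge-penalized piecewise-linear spline), which is not the total variation estimator of the paper. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def information_criteria(y, fitted, df):
    """Gaussian-form AIC and BIC (assumed forms, see lead-in)."""
    n = y.size
    rss = float(np.sum((y - fitted) ** 2))
    base = n * np.log(rss / n)
    return base + 2.0 * df, base + np.log(n) * df

def ridge_spline_fit(x, y, lam, n_knots=20):
    """Stand-in smoother: piecewise-linear spline with a ridge penalty on knot terms."""
    knots = np.linspace(x.min(), x.max(), n_knots + 2)[1:-1]
    B = np.column_stack([np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots])
    P = np.diag([0.0, 0.0] + [1.0] * n_knots)     # intercept and slope are unpenalized
    A = np.linalg.solve(B.T @ B + lam * P, B.T)   # (B'B + lam P)^{-1} B'
    fitted = B @ (A @ y)
    df = float(np.trace(A @ B))                   # effective degrees of freedom
    return fitted, df

def select_lambda(x, y, lambdas):
    """Return the grid values of lambda that minimize AIC and BIC."""
    scores = [information_criteria(y, *ridge_spline_fit(x, y, lam)) for lam in lambdas]
    aic, bic = np.array(scores).T
    return lambdas[np.argmin(aic)], lambdas[np.argmin(bic)]

# Penalty grid from 1e-5 to 10; the log spacing and grid size are assumptions.
lambdas = np.logspace(-5, 1, 25)

# Synthetic noisy curve standing in for one coordinate of a trajectory.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(x.size)

lam_aic, lam_bic = select_lambda(x, y, lambdas)
```

Because BIC weights the degrees of freedom by $\log n$ rather than 2, it selects a penalty at least as large as the one AIC selects whenever $n > e^2$, which is consistent with the smoother BIC fits reported here.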

Figure 6 shows the movement trajectories obtained using the BIC- and AIC-optimized models. The AIC-optimized model describes the complexity and variability of the movement patterns in more detail, whereas the BIC-optimized model tends to be simpler. In the subsequent analyses, however, the AIC and BIC fitting results show minimal differences. Therefore, based on the principle of model simplicity, only the BIC fitting results are presented and used for analysis.

Results and discussion

In this study, we aimed to estimate the migration trajectories of Demoiselle Cranes using the penalized B-spline method. Here, stopover refers to a temporary location where migratory birds rest and forage during long-distance migration. The estimated trajectories were then compared with previously identified migration routes and stopover site patterns to evaluate the accuracy and ecological relevance of our approach.

Although we used all available data, including observations with high levels of noise, our methodology produced migration patterns similar to those reported in previous studies that relied only on a limited set of noise-free observations^38,64. Figure 7 shows scatter plots of stopover sites identified in previous studies for individuals with migration trajectories similar to our fitted line^65,66. Many of these stopover sites lie along our estimated migration trajectories.

These findings suggest that our methodology effectively captures the actual migration routes of the cranes. Building on this, we further analyzed migration trajectory dynamics and identified two primary migration groups. The differences in their movement patterns appear to be influenced by environmental factors.

Northwest migration group:

Birds in this group predominantly traveled along western routes.

Northeast migration group:

This group followed eastern routes and eventually returned to Mongolia.

This separation between the groups is believed to have been influenced by temperature and wind speed³⁸.

Additionally, further research is needed to determine whether migration patterns are associated with specific climatic conditions and topographical features⁶⁷. Figure 8 visualizes the spatial separation and distinct trajectories of these groups, with scatter plots highlighting the divergence in their northward migration routes.

A comparison of the fitted models indicated little difference between the AIC- and BIC-optimized results. Since AIC tends to overfit, we selected BIC as the more conservative model selection criterion, as mentioned earlier. Figure 9 presents the migration patterns of the two groups (left: Northwest migration group; right: Northeast migration group) fitted using the BIC. These fitted trajectories effectively capture both overall trends and localized variations within each group, demonstrating that the penalized B-spline method is useful for distinguishing specific movement behaviors. This study integrates penalized curve-fitting with statistical modeling, providing a comprehensive approach to analyzing migratory behaviors.

Considering its consistency with previous research, the approach used in this study appears to be applicable to the analysis of movement patterns of other animals, and the estimated migration routes, as demonstrated in our study, can serve as a foundation for further research. Based on these findings, the methodology presented in this study can contribute to the precise estimation of migration routes in the fields of migration ecology and conservation science, and it is expected to serve as a valuable tool for habitat management and conservation planning.

Conclusion

In this study, we presented a robust framework for penalized B-spline curve fitting using the Alternating Direction Method of Multipliers (ADMM) algorithm. The proposed methodology effectively balances flexibility and smoothness by incorporating a total variation penalty, making it highly suitable for applications in diverse scientific domains. Through applications to oceanographic drifter data and ecological bird migration data, we demonstrated the method’s ability to handle noise, irregular sampling intervals, and complex patterns in spatiotemporal data.

While the methodology was developed for a general $p$-dimensional setting, where $p$ represents the number of response variables, the data analyses in this study were limited to $p = 2$, representing two-dimensional spatial data (latitude and longitude). A significant avenue for future research lies in extending this framework to higher-dimensional data settings, such as $p = 3$ or more. This expansion would enable the analysis of datasets where an additional dimension, such as altitude, temperature, or other environmental factors, plays a critical role.

Although we considered two-dimensional spatial data (latitude and longitude) with two response variables, for comparison with other well-known methods we also fitted each coordinate separately ($p = 1$) and then combined the results. In Table 1, we compare the MSE, MAE, and MXDV for both the Drifter and Crane datasets using the SplineCurve, SplineCurve_Ind, SmoothSpline²⁴, generalized additive model (GAM)²⁷, and kernel smoothing (K-Smooth)²⁸ methods. SplineCurve_Ind denotes this coordinate-wise application of our method, which matches the way the other methods are applied. For objective comparisons, we used the optimal parameter values, determined through cross-validation, for the SmoothSpline, GAM, and K-Smooth methods.
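The practical difference between the joint fit and SplineCurve_Ind can be illustrated with the shrinkage step that penalties of this kind typically induce. The sketch below is an assumption-laden illustration rather than the authors' ADMM update: it uses the standard soft-thresholding and group soft-thresholding operators, and the jump values, threshold, and function names are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding: each entry is shrunk on its own."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def group_soft_threshold(V, tau):
    """Row-wise group soft-thresholding: a row is kept or zeroed as a whole."""
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * V

# Rows are candidate knots, columns are responses (latitude, longitude).
# Each entry is a hypothetical jump in the fitted derivative at that knot.
jumps = np.array([[0.9, 0.1],
                  [0.2, 0.1],
                  [0.1, 1.2]])
tau = 0.5

# Coordinate-wise shrinkage (analogous to fitting each response with p = 1).
individual = np.column_stack([soft_threshold(jumps[:, j], tau) for j in range(2)])

# Joint shrinkage (analogous to the group penalty with p = 2).
grouped = group_soft_threshold(jumps, tau)

active_individual = individual != 0   # knot sets differ between the two columns
active_grouped = grouped != 0         # every retained knot is shared by both columns
```

In this toy example the coordinate-wise version keeps the first knot only for latitude and the third only for longitude, while the grouped version keeps both knots for both responses and drops the second knot entirely. A shared knot set is easier to interpret, and per-coordinate knot selection gives each response more freedom, which may help explain why SplineCurve_Ind can attain lower error.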

For the Drifter data, which comprises approximately 400 data points, the performance metrics (MSE, MAE, and MXDV) of all five methods were quite similar. However, for the Crane data, which contains about 18,000 data points, differences in the evaluation metrics became apparent. Specifically, the GAM method exhibited a relatively high MSE of about 15, indicating a decline in performance, while the SplineCurve, SmoothSpline, and K-Smooth methods yielded similar MSE values of around 11. Notably, the SplineCurve_Ind method, which uses individual fits, achieved the best performance with an MSE of about 5.7.
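The three evaluation metrics can be computed as in the sketch below. MXDV is not defined in this section, so treating it as the maximum absolute deviation between observed and fitted values is an assumption, as is pooling the errors over both coordinates; the function name `fit_metrics` and the sample values are hypothetical.

```python
import numpy as np

def fit_metrics(observed, fitted):
    """MSE, MAE, and MXDV (assumed here to be the maximum absolute deviation)."""
    err = np.asarray(observed, dtype=float) - np.asarray(fitted, dtype=float)
    abs_err = np.abs(err)
    return {
        "MSE": float(np.mean(err ** 2)),
        "MAE": float(np.mean(abs_err)),
        "MXDV": float(np.max(abs_err)),
    }

# Hypothetical observed and fitted values for one coordinate.
metrics = fit_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 5.0, 4.0])
```

Because the MSE squares each error, a few large deviations in a long series such as the Crane data can separate methods that look similar under the MAE.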

When comparing our SplineCurve method in the $p = 2$ case with well-known models such as SmoothSpline, GAM, and K-Smooth, we observed that their performance was comparable. In particular, when evaluated under the $p = 1$ setting used by the established models, our method demonstrated numerically superior performance. This confirms that our model is highly competitive compared to other approaches.

Table 1 Performance comparison for drifter and crane datasets using SplineCurve, SplineCurve_Ind, SmoothSpline, GAM and K-Smooth methods. SplineCurve_Ind represents the result of applying the SplineCurve method to univariate data. The mean and standard deviation (values in parentheses) of the MSE, MAE, and MXDV from 50 random sampling trials are presented.

One promising direction for future work is the application of this methodology to three-dimensional trajectory data, where $p = 3$. For instance, the analysis of bird flight paths, incorporating altitude as a third dimension alongside latitude and longitude, could provide deeper insights into migration dynamics and behavioral adaptations. Similarly, marine studies involving the 3D movement of aquatic animals, such as diving behaviors of whales or sharks, would greatly benefit from this extended approach. These datasets often contain rich, multidimensional information that could further validate and refine the proposed penalized B-spline framework.

Moreover, the ability to handle $p$ dimensions could open avenues for analyzing complex environmental or biomedical datasets. For example, incorporating additional dimensions such as time-varying physiological metrics in animal studies or multidimensional climate data in oceanographic research would enhance the scope and applicability of this method.

In summary, this study underscores the versatility and robustness of penalized B-spline curve fitting for analyzing spatiotemporal data. The results demonstrate the method’s effectiveness in capturing both localized variations and broader trends, providing valuable insights into ecological and oceanographic processes. By extending the framework to higher-dimensional settings, future research can unlock new possibilities for addressing complex scientific questions, making this methodology a cornerstone for multidimensional data analysis in the years to come.

Data availability

The real-time drifter dataset from the NOAA/AOML Global Drifter Program supports a wide range of research applications, from validating ocean current models to improving our understanding of air-sea interactions, which are vital for assessing climate change impacts. For more information, see https://www.aoml.noaa.gov/phod/gdp/real-time_data.php. The crane analysis used GPS tracking data collected in Mongolia between August 2018 and November 2023, which can be obtained from the following paper: Yanco, S. W. et al. Migratory birds modulate niche tradeoffs in rhythm with seasons and life history. Proc. Natl. Acad. Sci. 121, e2316827121 (2024).

References

de Boor, C. A Practical Guide to Splines (Springer, 1978).
Eilers, P. H. C. & Marx, B. D. Flexible smoothing with B-splines and penalties. Stat. Sci. (1996).
Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer, 2009).
Schumaker, L. L. Spline Functions: Basic Theory (Cambridge University Press, 2007).
Wahba, G. Spline Models for Observational Data (SIAM, 1990).
Ruppert, D., Wand, M. P. & Carroll, R. J. Semiparametric Regression (Cambridge University Press, 2003).
Green, P. J. & Silverman, B. W. Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach (CRC Press, 1994).
Wood, S. N. Generalized Additive Models: An Introduction with R (CRC Press, 2006).
Gu, C. Smoothing Spline ANOVA Models (Springer, 2013).
Rudin, L. I., Osher, S. & Fatemi, E. Nonlinear total variation based noise removal algorithms. Physica D (1992).
Chambolle, A. & Lions, P.-L. Image recovery via total variation minimization and related problems. Numer. Math. (1997).
Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Stat. Methodol.) (1996).
Simon, N., Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for Cox’s proportional hazards model via coordinate descent. J. Stat. Softw. (2011).
Wood, S. N. Generalized Additive Models: An Introduction with R 2nd edn. (CRC Press, 2017).
Tibshirani, R. J. & Taylor, J. The solution path of the generalized lasso. Ann. Stat. (2011).
Arnold, T. B. & Tibshirani, R. Efficient implementations of the generalized lasso dual path algorithm. J. Comput. Graph. Stat. (2014).
Boyd, S. et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3, 1–122 (2011).
Glowinski, R. & Tallec, P. L. Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics (SIAM, 1989).
Friedman, J., Hastie, T., Höfling, H. & Tibshirani, R. Pathwise coordinate optimization. Ann. Appl. Stat. (2007).
Zhou, S., Lafferty, J. & Wasserman, L. Time varying undirected graphs. Mach. Learn. (2010).
Yuan, M. & Lin, Y. Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B (Stat. Methodol.) (2006).
Jacob, L., Obozinski, G. & Vert, J.-P. Group lasso with overlap and graph lasso. In Proceedings of the 26th International Conference on Machine Learning (ICML) (2009).
Sprechmann, P., Bronstein, A. M. & Sapiro, G. Learning efficient sparse and low rank models. In IEEE Transactions on Pattern Analysis and Machine Intelligence (2011).
Wahba, G. Spline Models for Observational Data (SIAM, 1990).
Kim, S.-J., Koh, K., Boyd, S. & Gorinevsky, D. $\ell_1$ trend filtering. SIAM Rev. 51, 339–360 (2009).
Tibshirani, R. J. Adaptive piecewise polynomial estimation via trend filtering. Ann. Stat. 42, 285–323. https://doi.org/10.1214/13-AOS1189 (2014).
Hastie, T. J. Generalized additive models. In Statistical models in S. 249–307 (Routledge, 2017).
Wand, M. P. & Jones, M. C. Kernel Smoothing (CRC Press, 1994).
Riser, S. C. & Johnson, K. S. Net production of oxygen in the subtropical ocean. Nature 451, 323–325 (2008).
Lumpkin, R. & Pazos, M. Measuring surface currents with surface velocity program drifters: The instrument, its data, and some recent results. Lagrangian Anal. Predict. Coastal Ocean Dyn. 39 (2007).
Chelton, D. B., Schlax, M. G., Samelson, R. M. & de Szoeke, R. A. Global observations of large oceanic eddies. Geophys. Res. Lett. 34 (2007).
Maximenko, N., Hafner, J. & Niiler, P. Pathways of marine debris derived from trajectories of Lagrangian drifters. Mar. Pollut. Bull. 65, 51–62 (2012).
Newton, I. The Migration Ecology of Birds (Academic Press, 2008).
Robinson, C. J., Bohannan, B. J. & Young, V. B. From structure to function: The ecology of host-associated microbial communities. Microbiol. Mol. Biol. Rev. 74, 453–476 (2010).
Fiedler, W. Bird ecology as an indicator of climate and global change. In Climate Change. 181–195 (Elsevier, 2009).
Wikelski, M. et al. Going wild: What a global small-animal tracking system could do for experimental biologists. J. Exp. Biol. 210, 181–186 (2007).
Runge, C. A. et al. Protected areas and global conservation of migratory birds. Science 350, 1255–1258 (2015).
Galtbalt, B. et al. Differences in on-ground and aloft conditions explain seasonally different migration paths in demoiselle crane. Mov. Ecol. 10, 4. https://doi.org/10.1186/s40462-022-00302-z (2022).
Piegl, L. & Tiller, W. The NURBS Book (Springer, 1997).
Farin, G. Curves and Surfaces for CAGD: A Practical Guide 5th edn. (Morgan Kaufmann, 2002).
Jhong, J. H., Koo, J. Y. & Lee, S. W. Penalized B-spline estimator for regression functions using total variation penalty. J. Stat. Plan. Inference 184, 77–93. https://doi.org/10.1016/j.jspi.2016.12.003 (2017).
Wahlberg, B., Boyd, S., Annergren, M. & Wang, Y. An ADMM algorithm for a class of total variation regularized estimation problems. IFAC Proc. Vol. 45, 83–88 (2012).
Xu, Y., Wu, Y. & Yin, W. A unified alternating direction method of multipliers by majorization minimization. IEEE Trans. Pattern Anal. Mach. Intell. 40, 527–541 (2017).
Gu, Y., Fan, J., Kong, L., Ma, S. & Zou, H. ADMM for high-dimensional sparse penalized quantile regression. Technometrics 60, 319–331 (2018).
Parikh, N. & Boyd, S. Proximal algorithms. Found. Trends Optim. 1, 127–239 (2014).
Jenatton, R., Mairal, J., Obozinski, G. & Bach, F. Proximal methods for hierarchical sparse coding. J. Mach. Learn. Res. 12, 2297–2334 (2011).
NOAA/AOML. Real-time data from global drifter program (2024). Accessed 11 Dec 2024.
Ardhuin, F., Marié, L., Rascle, N., Forget, P. & Roland, A. Observation and estimation of Lagrangian, stokes, and Eulerian currents induced by wind and waves at the sea surface. J. Phys. Oceanogr. 39, 2820–2838 (2009).
Rijnsdorp, D. P. et al. Including the effect of depth-uniform ambient currents on waves in a non-hydrostatic wave-flow model. Coast. Eng. 187, 104420 (2024).
Doney, S. C. et al. Climate change impacts on marine ecosystems. Annu. Rev. Mar. Sci. 4, 11–37 (2012).
Pizzo, N., Deike, L. & Ayet, A. How does the wind generate waves?. Phys. Today 74, 38–43 (2021).
Pathirana, S., Young, I. & Meucci, A. Modelling swell propagation across the pacific. Front. Mar. Sci. 10, 1187473 (2023).
Wu, L., Sahlée, E., Nilsson, E. & Rutgersson, A. A review of surface swell waves and their role in air-sea interactions. Ocean Model. 102397 (2024).
Perrie, W., Tang, C., Hu, Y. & DeTracy, B. The impact of waves on surface currents. J. Phys. Oceanogr. 33, 2126–2140 (2003).
Copernicus Marine Environment Monitoring Service. Global Ocean Physics Analysis and Forecast (2025). Accessed 19 Feb 2025.
Akaike, H. Akaike’s information criterion. Int. Encycl. Stat. Sci. 25–25 (2011).
Schwarz, G. Estimating the dimension of a model. Ann. Stat. 461–464 (1978).
Rossby, T. Visualizing and quantifying oceanic motion. Annu. Rev. Mar. Sci. 8, 35–57 (2016).
Thomson, R. E. & Emery, W. J. Data Analysis Methods in Physical Oceanography (Newnes, 2014).
Cochran, J. K., Bokuniewicz, H. J. & Yager, P. L. Encyclopedia of Ocean Sciences (Academic Press, 2019).
Paduan, J. D. & Washburn, L. High-frequency radar observations of ocean surface currents. Annu. Rev. Mar. Sci. 5, 115–136 (2013).
Donlon, C. et al. The global ocean data assimilation experiment high-resolution sea surface temperature pilot project. Bull. Am. Meteorol. Soc. 88, 1197–1214 (2007).
Yanco, S. W. et al. Migratory birds modulate niche tradeoffs in rhythm with seasons and life history. Proc. Natl. Acad. Sci. 121, e2316827121 (2024).
Millington, S. & Batbayar, N. Unraveling the mystery of demoiselle crane migration. In Technical Report. Vol. 48(1). (International Crane Foundation, 2022).
Kanai, Y. et al. Migration of demoiselle cranes in Asia based on satellite tracking and fieldwork. Glob. Environ. Res. 4, 143–153 (2000).
Ilyashenko, E. I. et al. Migrations of the demoiselle crane (Anthropoides virgo, Gruiformes): Remote tracking along flyways and at wintering grounds. Biol. Bull. 49, 863–888. https://doi.org/10.1134/S1062359022070068 (2023).
Bishop, C. et al. The roller coaster flight strategy of bar-headed geese conserves energy during Himalayan migrations. Science 347, 250–254. https://doi.org/10.1126/science.1258732 (2015).

Acknowledgements

This work was supported by the Chungbuk National University NUDP program (2024). The work of Kwan-Young Bak was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00342014 and RS-2022-00165581). The work of Hee-Jung Jee was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00354881 and RS-2024-00440787). The work of R. Jisung Park was supported by the Wharton Data Analytics Initiative. The work of Ja-Yong Koo was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00253020 and RS-2023-00219212). The work of Jae-Hwan Jhong was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00342014 and RS-2024-00440787).

Funding

Chungbuk National University NUDP program (2024). The work of Kwan-Young Bak and various authors was supported by the National Research Foundation of Korea (NRF) grants funded by the Korea government (MSIT). Details of specific funding sources are RS-2024-00342014, RS-2022-00165581, RS-2024-00354881, RS-2024-00440787, RS-2023-00253020, RS-2023-00219212. The work of R. Jisung Park was supported by the Wharton Data Analytics Initiative.

Author information

Dong-Young Lee and Ju-Seong Lee contributed equally to this work.

Authors and Affiliations

School of Mathematics, Statistics and Data Science, Sungshin Women’s University, Seoul, 02844, Korea
Kwan-Young Bak
Data Science Center, Sungshin Women’s University, Seoul, 02844, Korea
Kwan-Young Bak
Department of Information Statistics, ChungBuk National University, Chungbuk, 28644, Korea
Dong-Young Lee, Ju-Seong Lee, Hee-Jung Jee & Jae-Hwan Jhong
School of Social Policy and Practice, Wharton School of Business, University of Pennsylvania, Pennsylvania, 19104, USA
R. Jisung Park
Department of Statistics, Korea University, Seoul, 02841, Korea
Ja-Yong Koo

Contributions

K.W. conducted the literature review and contributed to the writing of the introduction. D.Y. oversaw the analysis of the drifter data, while J.S. led the analysis of the crane data. H.J. contributed to data preprocessing and comparative experiments. J.H. implemented the algorithm code. J.P. contributed to the exploration of research trends and the interpretation of environmental factors in the data domain. J.Y. contributed to establishing theoretical hypotheses for model setting. K.W. and J.H. provided the methodological ideas and contributed to the manuscript preparation.

Corresponding author

Correspondence to Jae-Hwan Jhong.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Bak, KY., Lee, DY., Lee, JS. et al. Efficient curve fitting with penalized B-splines for oceanographic and ecological applications. Sci Rep 15, 21958 (2025). https://doi.org/10.1038/s41598-025-05779-3

Received: 26 December 2024
Accepted: 04 June 2025
Published: 01 July 2025
DOI: https://doi.org/10.1038/s41598-025-05779-3