Introduction

Chaotic systems exhibit sensitive dependence on initial conditions: small changes in the starting state can lead to vastly different outcomes over time. Understanding such systems requires sophisticated mathematical and computational tools that enable researchers to analyze complex behaviors, identify underlying patterns, and predict system evolution. Lorenz’s foundational work on deterministic nonperiodic flow provided early insight into such behavior1.

The phase-space dynamics of Hamiltonian systems can in general be described by two-dimensional area-preserving mappings, for which numerical simulations are easily performed. In terms of action-angle variables, such a mapping takes the form

$$\begin{aligned} J_{n+1} &= J_n + \beta \; f(J_{n+1},\theta _n) \end{aligned}$$
(1)
$$\begin{aligned} \theta _{n+1} &= \theta _n + 2 \pi g(J_{n+1},\theta _n) + \beta \; h(J_{n+1},\theta _n) \end{aligned}$$
(2)

where f, g and h are nonlinear functions of the parameters, the pair \((\theta _n, J_n)\) represents a point in the action-angle phase-space at time \(t_n\), and \(\beta\) represents a perturbation parameter. For many interesting applications the functions f and g are assumed to be independent of the action variable and the angle variable, respectively, that is, \(f= f(\theta _n)\), \(g = g(J_{n+1})\), while \(h \equiv 0\). Linearizing equation (2) around a period-1 fixed point \(J_0 = J_{n+1} = J_n\), and writing a nearby trajectory as \(J_n = J_0 + \Delta J_n\), the rescaled action \(p_n = 2 \pi g^\prime \Delta J_n\) (where \(g^\prime\) is the derivative of g evaluated at \(J_0\)) satisfies the so-called standard map2,3.
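To make the linearization explicit (our own restatement of the step above): with \(f = f(\theta _n)\), subtracting \(J_0\) from Eq. (1) gives \(\Delta J_{n+1} = \Delta J_n + \beta f(\theta _n)\), while a first-order expansion of g yields \(g(J_{n+1}) \approx g(J_0) + g^\prime (J_0) \, \Delta J_{n+1}\). Multiplying the first relation by \(2 \pi g^\prime\) and substituting \(p_n = 2 \pi g^\prime \Delta J_n\) gives

$$\begin{aligned} p_{n+1} = p_n + 2 \pi g^\prime \beta f(\theta _n), \qquad \theta _{n+1} = \theta _n + 2 \pi g(J_0) + p_{n+1}, \end{aligned}$$

and, with \(f = f_{max} \sin \theta _n\) and the constant \(2 \pi g(J_0)\) absorbed modulo \(2\pi\), this reduces to the standard map with \(k = 2 \pi g^\prime \beta f_{max}\).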

$$\begin{aligned} p_{n+1} &= p_n + k \; \sin \theta _n \nonumber \\ \theta _{n+1} &= \left( \theta _n + p_{n+1}\right) \mod 2\pi \end{aligned}$$
(3)

where \(k = 2\pi g^\prime \beta f_{max}\) is the chaoticity parameter and \(f_{max}\) is the maximal value of the function f, which, as usual, has been chosen to be sinusoidal, \(f/f_{max} = \sin \theta _n\)4. When \(k = 0\), the system (3) describes an integrable Hamiltonian system, while for \(k \ne 0\) it becomes non-integrable.
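As a concrete illustration, the map (3) can be iterated in a few lines of Python (a minimal sketch; the function names are ours, not from the original code):

```python
import numpy as np

def standard_map_step(theta, p, k):
    """One iteration of the standard map, Eq. (3):
    p_{n+1} = p_n + k sin(theta_n),  theta_{n+1} = (theta_n + p_{n+1}) mod 2*pi."""
    p_next = p + k * np.sin(theta)
    theta_next = (theta + p_next) % (2.0 * np.pi)
    return theta_next, p_next

def iterate(theta0, p0, k, n_iter):
    """Evolve a single initial condition, returning arrays of length n_iter + 1."""
    theta = np.empty(n_iter + 1)
    p = np.empty(n_iter + 1)
    theta[0], p[0] = theta0, p0
    for n in range(n_iter):
        theta[n + 1], p[n + 1] = standard_map_step(theta[n], p[n], k)
    return theta, p

# Integrable case: for k = 0 the action p is conserved exactly.
theta, p = iterate(theta0=1.0, p0=0.5, k=0.0, n_iter=100)
```

Setting \(k = 0\) in this sketch gives a constant action, the integrable limit noted above.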

The Standard Map models various phenomena, such as a kicked pendulum in the absence of friction and gravity, or the interaction of a charged particle with an electrostatic wave5,6. It belongs to the family of twist maps and serves as a foundational model for both classical and quantum chaos5,6, as it describes the onset of chaotic behavior near the separatrix of non-integrable systems. Additionally, the Standard Map exhibits the key properties of two-dimensional symplectic mappings, making it a prototype for studying such systems. The dynamics of the map in phase space are governed by the KAM (Kolmogorov–Arnold–Moser) theorem4, which addresses the persistence of regular, quasi-periodic motion under small perturbations of an integrable system. For instance, when an integrable Hamiltonian system is slightly perturbed, such as by \(k \ll 1\) in the map (3), resonances between degrees of freedom may disrupt the convergence of power series expansions. As a result, near the hyperbolic fixed points of the system, the presence of homoclinic and heteroclinic points ensures a region of chaotic motion, bounded by KAM surfaces. However, regular structures characterized by irrational frequency ratios associated with the action variables persist, even at higher values of the parameter k. For a fixed k, the phase space consists of interwoven regions of regular and irregular dynamics, with the extent of regular behavior diminishing as k increases. To illustrate this, in Figure 1, we show the classical phase space \((\theta _n,p_n)\) of the Standard Map for different values of k, using \(N = 160\) random initial values \((\theta _0, p_0)\) within the range \([0, 2\pi ]\) and performing \(L = 2048\) iterations. This figure captures the dynamics described by the KAM theorem. The colors in the figure simply represent different trajectories within the phase space.
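A portrait like Fig. 1 can be generated along the following lines (our own sketch, assuming uniform random initial conditions on \([0, 2\pi)\) as described above):

```python
import numpy as np

def phase_portrait(k, n_traj=160, n_iter=2048, seed=0):
    """Evolve n_traj random initial conditions of the standard map for n_iter
    iterations. Initial values (theta_0, p_0) are drawn uniformly from
    [0, 2*pi), and both coordinates are reported modulo 2*pi.
    Returns two arrays of shape (n_traj, n_iter)."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n_traj)
    p = rng.uniform(0.0, 2.0 * np.pi, size=n_traj)
    thetas = np.empty((n_traj, n_iter))
    ps = np.empty((n_traj, n_iter))
    for n in range(n_iter):
        p = p + k * np.sin(theta)                 # Eq. (3), action update
        theta = (theta + p) % (2.0 * np.pi)       # Eq. (3), angle update
        thetas[:, n] = theta
        ps[:, n] = p % (2.0 * np.pi)
    return thetas, ps

# Each row (thetas[i], ps[i]) corresponds to one coloured trajectory in Fig. 1.
thetas, ps = phase_portrait(k=1.2, n_traj=160, n_iter=256)
```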

Fig. 1

Phase space of the standard map \((\theta _n, p_n) \mod 2\pi\) for various values of the chaoticity parameter k. As k increases, chaotic trajectories begin to occupy more of the phase space, while regular trajectories persist, even at larger values of k. This illustrates the transition from regular to chaotic behavior as described by the KAM theorem, where the extent of regular regions in phase space diminishes as k increases.

Over recent years, machine learning (ML) techniques have emerged as powerful tools for forecasting and characterizing chaotic systems, often providing predictive capabilities that surpass those of traditional mathematical approaches7,8. Among these techniques, artificial neural networks (ANNs), particularly deep neural networks (DNNs), have become state-of-the-art for modeling chaotic time series. Deep learning algorithms can automatically learn hierarchical representations of data, enabling them to identify patterns at various levels of abstraction without explicit guidance from human programmers9,10,11. Various DNN architectures, including feed-forward neural networks (FFNs) and recurrent neural networks (RNNs), have been applied to chaotic systems with differing levels of success. While FFNs focus on learning static input-output relationships, RNNs and their advanced forms, such as reservoir computing models12,13 and long short-term memory (LSTM) networks14, excel in capturing temporal dependencies, making them particularly suitable for time series prediction in chaotic dynamics.

The scope of ML applications in chaotic systems has broadened significantly, introducing new tools for classification, forecasting, and parameter inference. In the context of classification, Boullé et al. (2020)15 and Celletti et al. (2022)16 demonstrate that deep learning can effectively classify chaotic and regular dynamics in complex systems, overcoming the limitations of conventional methods in high-dimensional systems. Similarly, Lee and Flach (2020)17 explore the application of deep learning techniques to classify chaotic and regular dynamics in dynamical systems, particularly in the two-dimensional Chirikov Standard Map. Using a convolutional neural network, the authors show that the model can effectively identify chaotic behavior even over short trajectories, where traditional numerical methods (such as the Lyapunov exponent) can fail to converge. The study highlights the neural network’s robustness across varying control parameters and its success in testing other discrete dynamical systems, including the one-dimensional logistic map and a discrete version of the three-dimensional Lorenz system. This approach provides a promising alternative for rapid chaos classification in complex systems.

When it comes to regression and forecasting tasks, Sangiorgio et al. (2022) demonstrate how deep learning models can effectively predict the multi-step evolution of chaotic systems, addressing the challenges posed by noise and real-world unpredictability in chaotic environments18. This work complements that of Pathak et al. (2017), who use machine learning to replicate chaotic attractors and calculate Lyapunov exponents, showing that data-driven models can capture the underlying chaotic structure without relying on exact system equations12. Similarly, Duncan et al. (2023) introduce a hybrid reservoir computing model that combines data-driven and model-based approaches, optimizing predictions in cases where partial system knowledge exists, thereby enhancing both model accuracy and robustness13. The studies by Kavuran et al. (2022) and Maathuis et al. (2017) further emphasize the role of machine learning in chaotic time series analysis. Kavuran et al. (2022) demonstrate the use of bidirectional LSTM (BiLSTM) for detecting structural variations in fractional-order chaotic systems, underscoring the flexibility of machine learning in identifying subtle dynamical changes, which is essential for applications in secure communications and encryption14. Maathuis et al. (2017) highlight the suitability of neural networks for forecasting chaotic time series, proposing these models as viable alternatives to traditional approaches, particularly in high-dimensional, nonlinear dynamic systems19.

Many complex physical phenomena are described by chaotic Hamiltonian models. However, the limited scope of experimental data available for real chaotic systems often reduces confidence in the theoretical feasibility of predicting key system properties with a finite dataset. Despite these limitations, in time series analysis, deep learning models trained on multiple short time series have been observed to outperform traditional methods, which typically require long time series to achieve similar results12,17,20. This enhanced performance stems from the ability of deep learning models to ‘learn’ the statistical properties of the system from a large set of short time series. However, this advantage relies on specific conditions: ergodicity, under which time averages coincide with ensemble averages, and stationarity, under which the statistical properties do not change over time. In non-ergodic or non-stationary systems, where these assumptions fail, the ensemble averaging approach often used by deep learning models may not hold. In such cases, advanced techniques like Finite Time Stability Exponents (FTSEs) and Finite Time Lyapunov Exponents (FTLEs), which capture the system’s dynamic structure over finite time windows, can provide more reliable insights and improve predictability21,22,32,33,34.

Suppose we examine the dynamics of a generic system described by an energy-based model, where phase-space evolution is governed by the standard map (3) with an unknown parameter k. By using the map’s dynamics across varying k values as training data for a DNN model, we aim to assess the model’s ability to estimate the unknown parameter k using only a limited number, L, of map iterations and N trajectories generated with that parameter. Unlike previous studies focusing solely on classifying trajectories or forecasting time series, this study addresses the following questions: How accurately can a DNN predict the chaoticity parameter k in a two-dimensional symplectic map? How does prediction accuracy depend on the number of iterations, L, and the initial conditions? What insights into the structure of chaotic phase space can be gained from the representations learned by the neural network?

By addressing these questions, we aim to contribute to the growing body of research on ML applications in chaos theory, demonstrating that neural networks can be utilized not only for short-term forecasting but also for the structural characterization of chaotic systems through parameter inference.

Methods

Data generation

In this study, we used 100 distinct values of k, spanning the range \(0 \le k \le 5\) with a step size of \(\Delta k = 0.05\). We generated a total of 169 initial conditions \((p_0, \theta _0)\), obtained by combining 13 discrete values of \(p_0\) and 13 discrete values of \(\theta _0\) within the interval \([0, 2\pi ]\), using a step size of \(\Delta = 0.5\). From this full set, we considered subsets of up to \(N = 160\) initial conditions, and for each initial condition, we performed a maximum of \(L = 2048\) iterations for every value of k.

To train our deep learning model, we constructed multiple datasets by extracting different subsets from this pool of initial conditions, using the following procedure. For each value of k, we randomly selected a subset of N initial conditions from the original set of 169, with \(N \in \{1, 10, 20, 30, \ldots , 160\}\). Each selected initial condition was evolved for a number of iterations, chosen from one of five possible values of L (128, 256, 512, 1024, 2048). This process was performed independently for each k. As a result, datasets such as \((L = 128, N = 100)\) and \((L = 256, N = 100)\), while containing the same number of trajectories, do not necessarily contain the same set of initial conditions, because the stochastic selection was applied independently to each dataset.
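The sampling procedure can be sketched as follows (our own illustration; the function names are assumptions, and since "100 values in \(0 \le k \le 5\) with step 0.05" leaves the endpoints ambiguous, we pick one consistent choice):

```python
import numpy as np

rng = np.random.default_rng(0)

# 100 values of k with step 0.05 (exact endpoints are our assumption).
K_VALUES = np.round(0.05 * np.arange(1, 101), 2)
# 13 x 13 = 169 grid initial conditions (p_0, theta_0) with step 0.5 on [0, 2*pi].
GRID = np.arange(0.0, 2.0 * np.pi, 0.5)
INITIAL_CONDITIONS = [(p0, t0) for p0 in GRID for t0 in GRID]

def make_dataset(k, n_traj, n_iter):
    """Randomly pick n_traj of the 169 grid initial conditions (without
    replacement) and evolve each for n_iter standard-map iterations.
    Returns an array of shape (n_traj, n_iter, 2) holding (p, theta)."""
    idx = rng.choice(len(INITIAL_CONDITIONS), size=n_traj, replace=False)
    data = np.empty((n_traj, n_iter, 2))
    for i, j in enumerate(idx):
        p, theta = INITIAL_CONDITIONS[j]
        for n in range(n_iter):
            p = p + k * np.sin(theta)
            theta = (theta + p) % (2.0 * np.pi)
            data[i, n] = (p, theta)
    return data

batch = make_dataset(k=1.2, n_traj=100, n_iter=128)  # one (L = 128, N = 100) instance
```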

The trajectories were divided based on the value of k into three disjoint subsets: a training set (\(70\%\)) for model fitting, a validation set (\(15\%\)) for hyperparameter tuning, and a test set (\(15\%\)) for evaluating the model’s generalization performance. For consistency, the assignment of k values to each set remained fixed across all pairs of N and L. In other words, if a particular k value was assigned to the training, validation, or test set for one specific pair of N and L, it retained that assignment across all other pairs.
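A fixed k-wise split of this kind can be sketched as follows (our illustration; the fixed seed stands in for whatever mechanism kept the assignment identical across all \((N, L)\) pairs):

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed: the k-assignment stays the same
                                 # for every (N, L) pair, as described above

k_values = np.round(0.05 * np.arange(1, 101), 2)   # the 100 values of k (assumed endpoints)
shuffled = rng.permutation(k_values)

n = len(shuffled)
n_train, n_val = (70 * n) // 100, (15 * n) // 100  # 70% / 15% / 15% split
train_k = shuffled[:n_train]
val_k = shuffled[n_train:n_train + n_val]
test_k = shuffled[n_train + n_val:]
```

Because the split is made over k values rather than over trajectories, every trajectory generated with a given k lands in exactly one of the three sets.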

The ability of a trained deep learning model to generalize to new, unseen samples depends on the similarity of the distributions in the training, validation, and test sets. Ensuring that these subsets are representative of the overall data distribution is crucial for optimal model performance. To achieve this, we employed a random partitioning strategy to divide the dataset into the training, validation, and test sets, ensuring that each set accurately reflected the overall distribution.

Deep learning model

Time series data consists of observations or measurements collected sequentially over time and is found in various domains such as finance, economics, weather forecasting, signal processing, and healthcare, among others. Unlike standard classification and regression tasks, time series problems introduce the complexity of temporal dependence between observations, necessitating specialized handling during model fitting and evaluation. However, this temporal structure can also be advantageous, offering additional insights such as trends and seasonality that can enhance model performance. Deep learning algorithms, particularly Deep Neural Networks (DNNs), have emerged as a promising approach to address the challenges of analyzing and modeling time series data. Their ability to capture complex temporal patterns and dependencies has led to their increasing use in time series analysis23,24,25.

DNNs map inputs to targets through a series of simple transformations learned from examples (pairs of inputs and targets). These transformations are parameterized by the network’s weights, so training a DNN involves finding the optimal set of weight values to accurately map inputs to their corresponding targets. Since DNNs often contain millions of parameters, this optimization task is challenging, as adjusting one weight can influence the behavior of the entire network.

To guide the training process, a loss function measures the difference between the predicted and true target values. This function computes a score that reflects the network’s performance on a given example. The score serves as feedback to adjust the weights, pushing them in the direction that reduces the loss for that example. Initially, the weights are assigned random values, leading to poor predictions and high loss scores. However, through iterative processing, the weights gradually adjust to minimize the loss function, allowing the network to make increasingly accurate predictions.

In this study, we apply these principles within the framework of a convolutional neural network (CNN), a specialized type of neural network designed to process data with a grid-like topology, such as images. Initially developed for tasks like handwritten digit recognition26, CNNs have also been applied to time series analysis, as time series data can be treated as a one-dimensional grid of samples taken at regular intervals. The core operation of a CNN is convolution, which computes a weighted sum of each input element and its neighbors, with the weights given by a kernel. As the kernel slides across the input, it generates a feature map, which is passed to subsequent layers. This process enables the network to develop hierarchical representations, from basic patterns in shallow layers to more complex structures in deeper layers. This hierarchical feature extraction enables CNNs to achieve high accuracy across a range of tasks, including image recognition27, object detection28, and natural language processing29.
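The convolution operation described here can be illustrated with a minimal NumPy sketch (as in most deep learning libraries, it is implemented as cross-correlation; the kernel and input below are made up):

```python
import numpy as np

def conv1d(x, kernel):
    """Valid-mode 1D cross-correlation: each output element is the
    kernel-weighted sum of a window of the input."""
    n = len(x) - len(kernel) + 1
    return np.array([np.dot(x[i:i + len(kernel)], kernel) for i in range(n)])

x = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])   # a toy 1D signal
edge = np.array([-1.0, 0.0, 1.0])                   # simple difference kernel
feature_map = conv1d(x, edge)   # -> [2, 2, 0, -2, -2], responding to slope
```

Stacking many such kernels, each learned during training, yields the multi-channel feature maps passed between layers.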

While CNNs can process time series data, sequence modeling benefits from a different approach to handle temporal dependencies. For this reason, we next consider recurrent neural networks (RNNs), specialized neural networks designed specifically for sequence modeling. RNNs maintain an internal memory state, which is a condensed representation of past information, continuously updated with new observations at each time step. However, training RNNs on tasks that involve long-term dependencies can be challenging due to issues like vanishing and exploding gradients30. To address these challenges, long short-term memory networks (LSTMs) were developed31. LSTM models enhance gradient flow by introducing a cell state that serves as a memory unit for storing long-term information. This cell state is regulated by a set of gates: the input gate, forget gate, and output gate. The input gate controls how new information is integrated into the cell state, while the forget gate determines which information from the previous state should be discarded. The output gate regulates the flow of information either to the next time step or as the final output of the network. By controlling the information flow through these gates, LSTMs can capture and retain long-term dependencies in sequential data, overcoming the limitations of traditional RNNs.
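A single LSTM time step can be sketched in NumPy as follows (a didactic illustration of the gate equations described above with random weights, not the trained model):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. W, U, b hold the stacked parameters of the
    input (i), forget (f), output (o) and candidate (g) transforms."""
    z = W @ x + U @ h + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)   # the three gates
    g = np.tanh(g)                                 # candidate cell update
    c_next = f * c + i * g    # forget gate discards, input gate admits
    h_next = o * np.tanh(c_next)  # output gate regulates the hidden state
    return h_next, c_next

hidden = 4
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * hidden, 2))        # 2 input features, e.g. (p, theta)
U = rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
h = c = np.zeros(hidden)
h, c = lstm_step(np.array([0.1, 0.2]), h, c, W, U, b)
```

The additive cell-state update `f * c + i * g` is what lets gradients flow over long sequences, in contrast to the repeated matrix products of a plain RNN.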

We apply these models to a multivariate time series of length L, represented by two components, p and q, which are generated from Equation 3. In the context of the CNN, these components will be referred to as ‘channels’, while in the LSTM context, they will be referred to as ‘features’. The model’s primary objective is to predict the parameter k, which governs the system dynamics. For each dataset instance, a distinct DNN was trained, validated, and tested. The architectures of the two models, a CNN and an LSTM, are depicted in Figures 2 and 3, respectively. Detailed information about the architectures can be found in the corresponding figures.

Fig. 2

Convolutional Neural Network (CNN) architecture designed for sequence data processing. The network consists of four ‘Down Blocks’, each containing three 1D convolutional layers, followed by a ReLU activation function. Each block ends with a MaxPooling layer that downsamples the feature maps by a factor of two. The input consists of two channels corresponding to the variables p and q, which are progressively expanded to 32, then 64, and finally 128 channels across the blocks. After the final block, an adaptive average pooling layer reduces each feature map to a single value. The resulting feature vector is passed through a classifier, composed of two fully connected layers with ReLU activations, and finally, a single output neuron that produces a prediction based on the input sequence.

Fig. 3

Architecture of the Long Short-Term Memory (LSTM) network for sequence data processing. The model consists of two stacked LSTM layers with 128 hidden units that process input sequences with two features. The output from the LSTM layers is then passed through a fully connected layer, which maps the features to a single output value. This output represents the predicted value of the chaoticity parameter k, based on the learned temporal dependencies in the sequential input data.

To assess predictive performance and model robustness, we trained each model using three loss functions: Mean Squared Error (MSE), Mean Absolute Error (MAE), and SmoothL1. The Mean Absolute Error (MAE), also referred to as L1 loss (Eq. (4)), computes the average of the absolute differences between the true and the predicted values of k (\(k_{\text {true}}\) and \(k_{\text {pred}}\), respectively). Since large errors contribute linearly to this loss, MAE is less sensitive to outliers than MSE, promoting a more uniform error distribution across the dataset.

$$\begin{aligned} MAE = \frac{1}{N} \sum _{i=1}^{N} |k_{\text {true}}(i) - k_{\text {pred}}(i)| \end{aligned}$$
(4)

The Mean Squared Error (MSE), also known as L2 loss (Eq. (5)), measures the average of the squared differences between the true and the predicted \(k\) values. Because the squaring operation magnifies large errors, MSE is more sensitive to outliers. This property encourages the model to prioritize reducing larger errors, which may be beneficial in applications requiring high precision in predictions.

$$\begin{aligned} MSE = \frac{1}{N} \sum _{i=1}^{N} (k_{\text {true}}(i) - k_{\text {pred}}(i))^2 \end{aligned}$$
(5)

SmoothL1 loss, a combination of L1 and L2 losses, strikes a balance between MSE and MAE (Eq. (6)). For small errors, SmoothL1 behaves similarly to MSE by squaring the differences, whereas for larger errors, it switches to an L1 form, using only the absolute difference. This hybrid approach moderates the impact of outliers while maintaining an overall focus on error reduction, making it more robust to extreme deviations.

$$\begin{aligned} \text {SmoothL1} = \frac{1}{N} \sum _{i=1}^{N} {\left\{ \begin{array}{ll} 0.5(k_{\text {true}}(i) - k_{\text {pred}}(i))^2 & \text {if } |k_{\text {true}}(i) - k_{\text {pred}}(i)| < 1 \\ |k_{\text {true}}(i) - k_{\text {pred}}(i)| - 0.5 & \text {otherwise} \end{array}\right. } \end{aligned}$$
(6)
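The three loss functions follow directly from Eqs. (4)-(6); the example below is a minimal NumPy sketch with made-up values:

```python
import numpy as np

def mae(k_true, k_pred):                  # Eq. (4), L1 loss
    return np.mean(np.abs(k_true - k_pred))

def mse(k_true, k_pred):                  # Eq. (5), L2 loss
    return np.mean((k_true - k_pred) ** 2)

def smooth_l1(k_true, k_pred):            # Eq. (6), quadratic below |error| = 1
    d = np.abs(k_true - k_pred)
    return np.mean(np.where(d < 1.0, 0.5 * d ** 2, d - 0.5))

k_true = np.array([1.0, 2.0, 3.0])
k_pred = np.array([1.5, 2.0, 5.0])   # one small error, one large outlier
```

On this toy example the single outlier dominates the MSE, while SmoothL1 penalizes it only linearly, illustrating the robustness argument above.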

We used the Adam optimizer for training with a learning rate of 0.001, while all other hyperparameters were maintained at their default settings. Each model was trained for 100 epochs. Despite variations in the dataset instances used, the neural network architectures remained consistent, enabling a comprehensive comparison of their performance across different datasets.

Results and discussion

As expected, the results for forecasting the parameter k depend on both the number of initial conditions N and the trajectory length L. Figure 4 displays the phase space for \(k=1.2\) as presented to the deep learning model during training. Although the density of trajectories varies, the main structures within the phase space remain preserved. This section presents the outcomes of the LSTM model using the SmoothL1 loss function, as outlined in Section “Deep learning model”. Comparative analysis of the CNN and LSTM models, as well as the effects of different loss functions, is provided in the Appendix.

Fig. 4

Phase space of the standard map \((\theta _n, p_n) \mod 2\pi\) for N different initial trajectories and L iterations of the map, with \(k = 1.2\). As the number of trajectories (N) and iterations (L) increase, the phase space becomes increasingly populated, providing a more comprehensive representation of the dynamics.

After training for a fixed pair \((N, L)\), we use the reserved values of k in the forecasting phase, as outlined in Section “Data generation”, to obtain the predicted values \(k_{pred}\) from our model. Figure 5 presents the Probability Density Functions (PDFs) of \(\log (k_{\text {pred}}/k_{\text {true}})\) for various combinations of \((k, N)\), each trained on trajectories of different lengths. Figure 6 illustrates the PDFs of \(\log (k_{\text {pred}}/k_{\text {true}})\) for different combinations of \((k, L)\), each trained on a variable number of trajectories. These PDFs were estimated using Kernel Density Estimation (KDE) with a Gaussian kernel and normalized to unit area.
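The error statistic and its KDE can be reproduced with SciPy; the predictions below are synthetic stand-ins for model output, not the paper's results:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
k_true = 1.2
# Synthetic stand-in: predictions log-normally scattered around k_true.
k_pred = k_true * np.exp(rng.normal(0.0, 0.1, size=500))

log_err = np.log(k_pred / k_true)      # the quantity plotted in Figs. 5 and 6
kde = gaussian_kde(log_err)            # Gaussian-kernel density estimate
grid = np.linspace(-0.5, 0.5, 201)
pdf = kde(grid)
area = np.sum(pdf) * (grid[1] - grid[0])   # ~1: the KDE integrates to unity
```

A narrower peak of `pdf` around zero corresponds to \(k_{\text {pred}} \approx k_{\text {true}}\), the accuracy criterion used below.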

Fig. 5

Probability Density Functions (PDFs) of the logarithmic error \(\log (k_{\text {pred}}/k_{\text {true}})\), estimated using Kernel Density Estimation (KDE) with a Gaussian kernel, for three different values of the chaoticity parameter k. Each plot shows the distributions of predictions from models trained on trajectories of varying lengths for a fixed pair \((k, N)\). This demonstrates how sequence length influences the accuracy of predicting the parameter k.

Fig. 6

Probability Density Functions (PDFs) of the logarithmic error \(\log (k_{\text {pred}}/k_{\text {true}})\), estimated using KDE with a Gaussian kernel, for three different values of k. The curves in each plot represent predictions from models trained on different numbers of trajectories for a fixed pair \((k, L)\). This illustrates the effect of the number of trajectories on the accuracy of the predicted values for the chaoticity parameter k.

Narrower PDFs around \(k_{\text {pred}} = k_{\text {true}}\) indicate improved accuracy of the model’s predictions. At first glance, the data-driven approach for predicting parameters in Hamiltonian chaotic systems, represented here by the standard map, shows only limited success. Generally, the predictability of the chaotic parameter \(k\) is influenced by the map’s dynamics, governed by the KAM theorem, in which the density of regular trajectories varies with \(k\). Thus, we find that the forecast accuracy for \(k\) depends on the parameter’s values, given fixed \(N\) and \(L\). Examining the figures, we observe that for lower values of \(k\), where regular trajectories cover a larger portion of phase space, the PDFs remain broad, particularly for smaller \(L\) and \(N\) values. However, as \(k\), \(L\), and \(N\) increase, the PDFs become narrower, suggesting improved prediction accuracy. This trend aligns with expectations, as data-driven deep learning models tend to perform better with larger datasets. Additionally, the differences in predictability across various \(k\) values are non-trivial and in some cases counterintuitive.

In Fig. 7, we report the mean values and standard deviations of \(\log (k_{\text {pred}}/k_{\text {true}})\) as functions of \(k_{\text {true}}\) used in forecasting across different combinations of \((N, L)\). Our results confirm that higher values of \(N\) and \(L\) enhance the system’s predictability. A particularly interesting and counterintuitive finding is that, for a fixed \((N, L)\), there is a trend indicating that the PDFs become narrower as \(k\) increases. This suggests that predictability improves at higher \(k\) values, even when the PDFs occasionally peak away from \(k_{\text {pred}} = k_{\text {true}}\). However, the best results are achieved for intermediate values of \(k\).

In Fig. 8, we display the dispersion of \(k_{\text {pred}}\) around the corresponding \(k_{\text {true}}\) values on the \((k_{\text {true}}, k_{\text {pred}})\) plane, for selected values of \((N, L)\). This figure reveals that the minimum dispersion tends to occur at intermediate \(k\) values. In other words, the deep learning model appears to perform best when the system’s phase space is balanced between extreme conditions, avoiding overly expansive or constrained regions of chaotic and regular trajectories. This finding confirms our earlier result, indicating that prediction accuracy depends more on the phase space structure than solely on the values of \(N\) and \(L\).

Fig. 7

Average values and standard deviations of the logarithmic error \(\log (k_{\text {pred}}/k_{\text {true}})\) across test sets for various combinations of the pair \((N, L)\). A solid horizontal line at zero indicates perfect predictions (\(\text {predictions} = \text {true}\)). This figure provides insight into how the model’s prediction accuracy improves as the number of trajectories and sequence length vary.

Fig. 8

Comparison of the average predictions of k by the deep neural network (DNN) (on the y-axis) against the ground truth values (on the x-axis) for the test sets. The diagonal line represents perfect predictions, where \(\text {predictions} = \text {true}\). Error bars represent the standard deviation from the average, indicating the variability of the predictions. The average and standard deviation are calculated over predictions made using varying numbers of trajectories, N.

Conclusion

In this study, we applied deep neural networks (DNNs) to analyze the standard map, a model for non-integrable Hamiltonian systems governed by the KAM theorem, to evaluate its effectiveness in predicting the chaoticity parameter \(k\). Our goal was to explore whether deep learning can reliably estimate the chaoticity parameter, which represents the degree of chaos, by training on variations in initial conditions and trajectory lengths. The results underscore that prediction accuracy is highly dependent on both the phase-space structure, controlled by \(k\), and the density of trajectories in the phase space. Specifically, prediction accuracy improves as the phase space becomes more evenly populated with both regular and chaotic trajectories, particularly for intermediate values, where regular and irregular structures are balanced.

For lower values of \(k\), where regular trajectories dominate, the DNN model showed limited predictive ability, likely due to the relatively uniform coverage of phase space by regular trajectories. In contrast, high values of \(k\) presented another challenge: while dominated by chaotic trajectories, these regions also retained persistent stable structures, such as stable manifolds and regular islands, within the predominantly chaotic regions of phase space. These structures are a distinctive feature of mixed-phase-space systems, governed by the KAM theorem, where regular and chaotic dynamics coexist. Even in regions where chaos predominates, these stable structures remain embedded and can influence the dynamics, making it more difficult for the DNN to make accurate predictions. These ‘hidden’ regularities are not necessarily obvious or widespread, but their presence creates a complex topology that hinders the model’s ability to learn and predict accurately. However, for intermediate values of \(k\), the model achieved its best performance, suggesting that deep learning is most effective when the phase space is neither entirely regular nor entirely chaotic.

Further analysis, as presented in the Appendix, provided deeper insights into the model’s behavior concerning chaotic and non-chaotic trajectories. The results indicate that predictions based solely on non-chaotic trajectories are significantly more accurate than those for chaotic trajectories, emphasizing the model’s sensitivity to regular patterns.

Additionally, the comparative analysis of convolutional neural networks (CNNs) and long short-term memory (LSTM) models revealed distinct advantages. The LSTM model demonstrated superior performance in predicting chaotic trajectories compared to the CNN model, which exhibited higher loss values for chaotic paths. This suggests that LSTMs, which can capture temporal dependencies, are better suited for handling chaotic systems. Furthermore, the results confirmed that increasing the number of initial conditions (N) and the trajectory length (L) enhances the model’s predictive accuracy, though improvements plateau beyond a certain threshold.

Loss function analysis further supported these findings, with SmoothL1 outperforming Mean Squared Error (MSE) and Mean Absolute Error (MAE) across different trajectory lengths and models. This underscores the importance of selecting appropriate loss functions when training deep learning models for chaotic systems.

Ultimately, this research demonstrates that while DNNs have significant potential in chaotic system analysis, their efficacy in parameter inference is subject to inherent system complexities as defined by the KAM theorem. Future research can leverage these insights to refine deep learning models for better adaptability in systems with varying chaotic and regular dynamics.