Multi-omic network inference from time-series data

Moscardó García, María; Aalto, Atte; Montanari, Arthur N.; Skupin, Alexander; Gonçalves, Jorge

doi:10.1038/s41540-025-00591-1

Article
Open access
Published: 14 October 2025

María Moscardó García¹,
Atte Aalto¹,
Arthur N. Montanari^1,2,3,
Alexander Skupin^1,4,5 &
Jorge Gonçalves^1,6

npj Systems Biology and Applications volume 11, Article number: 114 (2025)

Abstract

Biological phenotypes emerge from complex interactions across molecular layers. Yet, data-driven approaches to infer these regulatory networks have primarily focused on single-omic studies, overlooking inter-layer regulatory relationships. To address these limitations, we developed MINIE, a computational method that integrates multi-omic data from bulk metabolomics and single-cell transcriptomics through a Bayesian regression approach that explicitly models the timescale separation between molecular layers. We validate the method on both simulated datasets and experimental Parkinson’s disease data. MINIE exhibits accurate and robust predictive performance across and within omic layers, including curated multi-omic networks and the lac operon. Benchmarking demonstrated significant improvements over state-of-the-art methods while ranking among the top performers in comprehensive single-cell network inference analysis. The integration of regulatory dynamics across molecular layers and temporal scales provides a powerful tool for comprehensive multi-omic network inference.

Introduction

Recent advances in experimental techniques have revolutionised our capacity to simultaneously acquire high-throughput data from the genome, epigenome, transcriptome, proteome, and metabolome¹. Datasets that encompass multiple omic layers within a biological system are collectively referred to as multi-omic data. Thus far, omic layers within these datasets have often been analysed in isolation^2,3,4,5, offering a partial understanding of the complex machinery of biological systems⁶. The integrative analysis of multi-omic data directly addresses this limitation, providing a holistic perspective of biological processes and cellular functions⁷. Such approaches have been successfully applied to associate omic entities with specific phenotypes⁸, identify biomarkers for drug therapies⁹, stratify patients¹⁰, and conduct functional analyses¹¹. Inferring regulatory networks through multi-omic data can reveal complex genotype–phenotype relationships and uncover regulatory pathways overlooked in single-omic studies.

Biological interactions are commonly represented as regulatory networks, where nodes correspond to biological molecules associated with distinct omics (e.g., genes, proteins, or metabolites) and directed edges indicate causal effects between molecules. Inferring these causal relationships typically requires time-series data to capture the temporal order of events in the system^12,13,14. In computational biology, considerable research has focused on network inference from time-series data¹⁵, with a particular emphasis on gene regulatory networks (GRNs)^{3,16,17,18,19}. Traditionally, bulk RNA sequencing (RNA-seq) data was the primary source of data for GRN inference^3,20,21,22, but lately, single-cell RNA sequencing (scRNA-seq) data has been receiving considerable attention for its ability to capture cellular heterogeneity^{16,17,23,24,25,26}. Yet, these tools primarily focus on single-omic studies and do not consider the challenges associated with the integration of multi-omic data.

Multi-omic data exhibit significant sample heterogeneity and variability, especially when data is measured at a single-cell resolution. Experimental protocols for data collection can be distinct for each omic layer, leading to multiple data modalities¹. More importantly, these layers are regulated at different timescales ranging from seconds to hours²⁷. These characteristics require the development of data-driven methods that can model the system dynamics over a wide range of molecular layers and temporal scales. In this context, several methods have been developed to improve GRN inference by integrating genomic, transcriptomic, and chromatin accessibility data^28,29,30. However, these tools are limited to inferring gene-gene interactions and cannot predict heterogeneous, cross-omic interactions. One notable method that overcomes this limitation, KiMONo³¹, combines statistical models with prior knowledge of protein-protein interactions to infer regulatory networks from multi-omic data. Yet, these predictions strongly rely on human-curated knowledge, which is known to be sparse and incomplete, and the method is not designed for time-series data. Other existing approaches to multi-omic network inference are based on graph representation learning algorithms^32,33. These methods also have limitations: they typically focus on statistical correlations between nodes, which hinder the identification of causal relationships, and are based on pre-specified network topologies that may not accurately represent the true underlying biological network. While tools like TREM-Flux³⁴ and scFEA³⁵ integrate transcriptomic and metabolomic data, their primary goal is metabolic flux estimation rather than inferring causal regulatory relationships between omic layers. Despite these advances, computational methods that can build causal dynamical models from time-series multi-omic data are still needed.

This paper presents MINIE, a computational method for Multi-omIc Network Inference from timE-series data. Our approach follows a two-step pipeline for the inference of inter- and intra-layer interactions. First, we incorporate the timescale separation across omic layers using a model of differential-algebraic equations (DAEs). Second, we integrate the two most common data modalities available in multi-omic datasets—bulk and single-cell measurements—within a Bayesian regression framework, enabling the inference of the network topology. We validate our approach using case studies that integrate single-cell transcriptomic data (slow layer) with bulk metabolomic data (fast layer). These two omics were chosen due to the critical role of metabolites as both end products of gene expression and key regulators of cellular processes. Our results show that MINIE accurately infers regulatory networks using synthetic datasets generated from both linear and nonlinear dynamical models. When applied to experimental data from Parkinson’s disease (PD) studies, MINIE successfully identified high-confidence interactions reported in literature as well as novel links that are potentially relevant to PD (which could then be further validated). In benchmarking against state-of-the-art algorithms, MINIE outperformed single-omic methods, underscoring the importance of purpose-built algorithms for multi-omics integration. Additionally, when compared to existing GRN inference methods designed exclusively for scRNA-seq data, MINIE demonstrated superior performance in curated and synthetic networks. Overall, these findings highlight MINIE’s potential to advance our understanding of complex biological systems through comprehensive multi-omic integration.

Results

Method overview

MINIE is a data-driven network inference tool designed to identify causal interactions both within and across omic layers. Figure 1 summarises the method pipeline. This paper focuses on the regulatory role of the metabolome on the transcriptome. Accordingly, MINIE takes as inputs time-series of both transcriptomic and metabolomic data (Fig. 1a). Given the advances in sequencing technologies, transcriptomic data is now available at the single-cell level (scRNA-seq data), while metabolomic measurements typically remain at the bulk level. MINIE integrates these commonly used data modalities in multi-omic experiments.

A significant challenge for causal inference is the timescale separation in the regulation of different omics³⁶. For instance, the turnover time of the metabolic pool in mammalian cells is approximately one minute, while the mRNA pool half-life is around ten hours²⁷. MINIE captures this phenomenon using a dynamical model of DAEs: the slow transcriptomic dynamics are captured by differential equations that govern the evolution of mRNA concentrations over time, while the fast metabolic dynamics are encoded as algebraic constraints that assume instantaneous equilibration of metabolite concentrations. This allows DAEs to explicitly integrate processes that unfold on vastly different timescales within a single unified model. By contrast, when fast and slow processes coexist, ordinary differential equations (ODEs) require stiff numerical approximations that are unstable and computationally demanding, providing an inaccurate representation of the underlying biological system.

The DAE model is formalised as

$$\begin{array}{ll}\dot{{\boldsymbol{g}}}\,=\,{\boldsymbol{f}}({\boldsymbol{g}},{\boldsymbol{m}},{{\boldsymbol{b}}}_{{\rm{g}}};{\boldsymbol{\theta }})+{\boldsymbol{\rho }}({\boldsymbol{g}},{\boldsymbol{m}}){\boldsymbol{w}},\\ \dot{{\boldsymbol{m}}}\,=\,{\boldsymbol{h}}({\boldsymbol{g}},{\boldsymbol{m}},{{\boldsymbol{b}}}_{{\rm{m}}};{\boldsymbol{\theta }})\approx 0,\end{array}$$

(1)

where ${\boldsymbol{g}}\in {{\mathbb{R}}}_{\ge 0}^{{n}_{{\rm{g}}}}$ denotes a vector containing the expression levels of n_g genes, ${\boldsymbol{m}}\in {{\mathbb{R}}}_{\ge 0}^{{n}_{{\rm{m}}}}$ denotes the concentration levels of n_m metabolites, and n = n_g + n_m is the total number of molecules. The nonlinear functions ${\boldsymbol{f}}:{{\mathbb{R}}}^{n}\mapsto {{\mathbb{R}}}^{{n}_{{\rm{g}}}}$ and ${\boldsymbol{h}}:{{\mathbb{R}}}^{n}\mapsto {{\mathbb{R}}}^{{n}_{{\rm{m}}}}$ describe the multi-layer interactions involved in gene and metabolite regulation, respectively. The algebraic equations arise from the quasi-steady-state approximation $\dot{{\boldsymbol{m}}}(t)\approx 0$, due to the assumption that changes in ${\boldsymbol{m}}(t)$ occur much faster than those of other variables in the system, allowing ${\boldsymbol{m}}(t)$ to be considered effectively constant on the timescales of interest. Stochastic influences observed in biological processes, such as cellular noise, are accounted for using a multiplicative noise model, where ρ(g, m) is a state-dependent function representing the noise amplitude and w is a Gaussian white noise. External influences or baseline effects that are known a priori (e.g., gene knockdown) are represented by ${{\boldsymbol{b}}}_{{\rm{g}}}\in {{\mathbb{R}}}^{{n}_{{\rm{g}}}}$ and ${{\boldsymbol{b}}}_{{\rm{m}}}\in {{\mathbb{R}}}^{{n}_{{\rm{m}}}}$. Finally, we explicitly denote the parameters sought to be identified in our model from data as ${\boldsymbol{\theta }}\in {{\mathbb{R}}}^{{n}_{{\rm{p}}}}$.

The method is divided into two steps described in Fig. 1b: (1) transcriptome–metabolome mapping inference and (2) regulatory network inference via Bayesian regression. The first step is grounded on the algebraic component of Eq. (1). Assuming h can be approximated by a linear function, we have that

$$0\approx {A}_{{\rm{mg}}}{\boldsymbol{g}}+{A}_{{\rm{mm}}}{\boldsymbol{m}}+{{\boldsymbol{b}}}_{{\rm{m}}},$$

(2a)

$${\boldsymbol{m}}\approx -{A}_{{\rm{mm}}}^{-1}{A}_{{\rm{mg}}}{\boldsymbol{g}}-{A}_{{\rm{mm}}}^{-1}{{\boldsymbol{b}}}_{{\rm{m}}},$$

(2b)

where ${A}_{{\rm{mg}}}\in {{\mathbb{R}}}^{{n}_{{\rm{m}}}\times {n}_{{\rm{g}}}}$ and ${A}_{{\rm{mm}}}\in {{\mathbb{R}}}^{{n}_{{\rm{m}}}\times {n}_{{\rm{m}}}}$ are matrices encoding the gene-metabolite and metabolite-metabolite interactions, respectively (Eq. (2b) requires A_mm to be invertible). Using time-series measurements of metabolite concentrations ${\boldsymbol{m}}(t)$ and gene expression ${\boldsymbol{g}}(t)$, this formulation allows A_mg and A_mm to be inferred through a sparse regression problem, circumventing the underdetermined nature of biological systems (characterised by high-dimensional data and limited sample sizes). To narrow down the number of interactions inferred in A_mm and A_mg, we curated a list of human metabolic reactions documented in the literature³⁷. This list was then used to identify metabolite-metabolite and gene-metabolite interactions, and to constrain the nonzero elements in A_mm and A_mg to only those interactions that are known a priori (see Fig. S1 for an illustration of the curated network). Consequently, the structure of these matrices is fixed, and the sparse regression infers only the corresponding interaction strengths. The transcriptome–metabolome mapping $\Gamma =-{A}_{{\rm{mm}}}^{-1}{A}_{{\rm{mg}}}$ is then computed from the inferred A_mm and A_mg and captures the relationship between transcriptomic and metabolomic data.
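The first step can be illustrated with a minimal Python sketch. It is not MINIE's implementation: it assumes the allowed nonzero pattern of A_mg and A_mm is supplied as boolean masks, it replaces the sparse regression with ordinary least squares on that fixed support, and it fixes the diagonal of A_mm to −1 to exclude the trivial all-zero solution of Eq. (2a). The function name `fit_mapping` and the toy matrices are hypothetical.

```python
import numpy as np

def fit_mapping(G, M, mask_mg, mask_mm):
    """Fit A_mg, A_mm, b_m on a fixed support and return Gamma = -A_mm^{-1} A_mg.

    G: (T, n_g) gene expression, M: (T, n_m) metabolite concentrations.
    mask_mg, mask_mm: boolean arrays marking the entries allowed to be nonzero.
    Assumption (not from the paper): diag(A_mm) is fixed to -1 for normalisation.
    """
    T, n_g = G.shape
    n_m = M.shape[1]
    A_mg = np.zeros((n_m, n_g))
    A_mm = -np.eye(n_m)
    b_m = np.zeros(n_m)
    for i in range(n_m):
        g_idx = np.flatnonzero(mask_mg[i])
        m_idx = np.array([k for k in np.flatnonzero(mask_mm[i]) if k != i], dtype=int)
        # Row i of Eq. (2a) with A_mm[i, i] = -1:
        # m_i = A_mg[i, :] g + sum_{k != i} A_mm[i, k] m_k + b_m[i]
        X = np.hstack([G[:, g_idx], M[:, m_idx], np.ones((T, 1))])
        coef, *_ = np.linalg.lstsq(X, M[:, i], rcond=None)
        A_mg[i, g_idx] = coef[:g_idx.size]
        A_mm[i, m_idx] = coef[g_idx.size:g_idx.size + m_idx.size]
        b_m[i] = coef[-1]
    Gamma = -np.linalg.solve(A_mm, A_mg)
    return A_mg, A_mm, b_m, Gamma

# Toy example with 3 genes and 2 metabolites (hypothetical values).
rng = np.random.default_rng(0)
A_mg_true = np.array([[1.0, 0.0, 0.5], [0.0, 2.0, 0.0]])
A_mm_true = np.array([[-1.0, 0.3], [0.0, -1.0]])
b_true = np.array([0.1, 0.2])
G = rng.uniform(0.0, 2.0, size=(50, 3))
# Noise-free metabolite data generated from Eq. (2b).
M = -np.linalg.solve(A_mm_true, A_mg_true @ G.T + b_true[:, None]).T

A_mg_hat, A_mm_hat, b_hat, Gamma_hat = fit_mapping(G, M, A_mg_true != 0, A_mm_true != 0)
Gamma_true = -np.linalg.solve(A_mm_true, A_mg_true)
```

On noise-free linear data the fitted mapping matches the true one; with real measurements, a regularised estimator would take the place of the plain least-squares call.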

The second step builds on the differential component of Eq. (1). To characterise the complexity and uncertainty of gene regulatory dynamics, we model f as a Gaussian process (GP). This GP framework describes the temporal evolution of gene expressions (i.e., gene trajectories) through a mean function and a covariance function encoding gene-gene and gene-metabolite relationships. Our approach to estimating the GP functions is based on BINGO³, a GRN inference tool developed for bulk RNA-seq data. This step of MINIE introduces two key modifications:

1. scRNA-seq data integration via pseudotime estimates;
2. metabolomic integration via the inferred transcriptome–metabolome mapping Γ: g ↦ m.

First, we introduce the latent variable τ, known as pseudotime, to model cellular progression within the biological system under study. Unlike actual time, pseudotime is inferred from data and represents the relative temporal progression of each measured cell state along a biological process. For example, cell differentiation is a dynamical process where two measured cells (even if sampled at the same time point) might be at different stages in the differentiation process. Pseudotime aims to infer the developmental trajectory followed by those cells. Given the inherent noise in scRNA-seq data, we sample gene trajectories (using the GP model) and identify those that best fit the scattered data points. This procedure ensures consistency between pseudotime estimates, single-cell gene expression data, and the underlying dynamical model.

Second, MINIE uses the mapping Γ to infer the single-cell metabolomic trajectories based on the sampled gene trajectories. These metabolic profiles are treated as external inputs to the GP model, integrating the regulation of metabolites on gene dynamics at the single-cell level. We note that perturbations, such as drugs or mutations, can also be incorporated as external inputs. As a result, the inferred dynamical model can be used to uncover the underlying network structure and identify potential perturbation targets (e.g., drug targets) that modulate gene regulation.

We employ Bayesian inference and three Markov Chain Monte Carlo (MCMC) samplers to infer the posterior distribution of gene trajectories, pseudotime, and model parameters (including the network topology). MINIE builds the posterior distribution p(θ∣G) for the model parameters θ given the measured (transcriptomic) data G. To make the distribution tractable, gene trajectories ${\boldsymbol{g}}(t)$ and pseudotime τ are introduced as latent variables, yielding

$$p({\boldsymbol{\theta }}| G)\propto p({\boldsymbol{\theta }})\iint \,p(G| {\boldsymbol{g}},{\boldsymbol{\tau }},{\boldsymbol{\theta }}\,)p(\,{\boldsymbol{g}}| {\boldsymbol{\theta }})p({\boldsymbol{\tau }}){\rm{d}}{\boldsymbol{g}}\,{\rm{d}}{\boldsymbol{\tau }},$$

(3)

where p(g∣θ) describes the gene trajectories ${\boldsymbol{g}}(t)$ over potential network interactions (encoded in θ).

MINIE’s output is a confidence matrix $C\in {[0,1]}^{{n}_{{\rm{g}}}\times n}$, where each entry C_ij corresponds to the probability that molecule j regulates gene i (Fig. 1c). By setting a threshold C_ij ≥ ε, we can infer the underlying multi-omic regulatory network, identifying the specific genes and metabolites involved in gene expression regulation. See “Methods” section for mathematical and algorithmic details on MINIE’s implementation.
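As a small illustration of the thresholding step, the sketch below turns a confidence matrix into an edge list. The column ordering (genes first, then metabolites), the node labels, and the function name `threshold_network` are assumptions made for this example.

```python
import numpy as np

def threshold_network(C, eps, names):
    """List (regulator, target, confidence) for every entry with C[i, j] >= eps.

    C has shape (n_g, n): row i is the regulated gene, column j the regulator.
    names holds the n node labels, genes first (assumed ordering).
    """
    edges = []
    for i in range(C.shape[0]):
        for j in range(C.shape[1]):
            if C[i, j] >= eps:
                edges.append((names[j], names[i], float(C[i, j])))
    # Most confident interactions first.
    return sorted(edges, key=lambda e: -e[2])

names = ["g1", "g2", "m1"]  # hypothetical labels: two genes, one metabolite
C = np.array([[0.05, 0.10, 0.90],
              [0.80, 0.02, 0.30]])
edges = threshold_network(C, 0.7, names)
```

Here `edges` contains the metabolite-to-gene link m1 → g1 and the gene-gene link g1 → g2; lowering `eps` adds progressively less confident links.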

Proof-of-concept on a multi-layer network motif

To validate the fundamental design of MINIE, we conducted a proof-of-concept experiment using a multi-layer network motif. Figure 2 shows MINIE’s performance on this case study. The system used to generate synthetic data consists of 5 nodes (3 genes and 2 metabolites) modelled as in Eq. (1), where both f and h are linear functions encoding the network interactions (Fig. 2a). In this case, the DAE model is given by

$$\left[\begin{array}{c}\dot{{\boldsymbol{g}}}(t)\\ 0\end{array}\right]=\left[\begin{array}{cc}{A}_{{\rm{gg}}}&{A}_{{\rm{gm}}}\\ {A}_{{\rm{mg}}}&{A}_{{\rm{mm}}}\end{array}\right]\left[\begin{array}{c}{\boldsymbol{g}}(t)\\ {\boldsymbol{m}}(t)\end{array}\right]+\left[\begin{array}{c}{{\boldsymbol{b}}}_{{\rm{g}}}\\ {{\boldsymbol{b}}}_{{\rm{m}}}\end{array}\right],$$

(4)

where the block matrices have consistent dimensions and encode the intra- and inter-omic interactions. Details on model parameters and data generation can be found in the Supplementary Material, Section 2. MINIE’s performance was statistically evaluated using 100 synthetic datasets with different initial conditions and external inputs (e.g., overexpression or underexpression levels). Each dataset comprises 11 time points sampled under two experimental conditions, in which the concentration of either metabolite 1 or 2 is increased (e.g., b_m = [1, 0] or [0, 1]), mimicking control versus mutant/treated experiments in biological studies. Figure 2b illustrates the synthetic (transcriptomic and metabolomic) data for a representative realisation, together with the fitted pseudotime and metabolic trajectories by MINIE. As expected, given the linearity of the data (and our assumption that the structures of A_mg and A_mm are known), the inferred metabolite trajectories by the mapping Γ are almost perfect.
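To make the structure of Eq. (4) concrete, the following sketch simulates a linear DAE with 3 genes and 2 metabolites. The matrix values are hypothetical (the parameters actually used are given in the Supplementary Material), noise is omitted, and a forward-Euler scheme is assumed for simplicity. At every step the metabolites are obtained from the algebraic constraint, and only the genes are integrated in time.

```python
import numpy as np

# Hypothetical interaction matrices for a 3-gene / 2-metabolite motif.
A_gg = np.array([[-1.0, 0.0, 0.0],
                 [0.5, -1.0, 0.0],
                 [0.0, 0.0, -1.0]])
A_gm = np.array([[0.0, 0.0],
                 [0.0, 0.0],
                 [0.8, -0.4]])
A_mg = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0]])
A_mm = -np.eye(2)

def simulate_linear_dae(g0, b_g, b_m, t_end=10.0, dt=0.01, n_samples=11):
    """Integrate the gene equations of Eq. (4) with forward Euler.

    The metabolites are slaved to the genes through 0 = A_mg g + A_mm m + b_m.
    Returns n_samples equally spaced snapshots of g and m.
    """
    n_steps = int(round(t_end / dt))
    sample_every = n_steps // (n_samples - 1)
    g = np.asarray(g0, dtype=float).copy()
    G, M = [], []
    for k in range(n_steps + 1):
        m = -np.linalg.solve(A_mm, A_mg @ g + b_m)  # algebraic part
        if k % sample_every == 0:
            G.append(g.copy())
            M.append(m.copy())
        g = g + dt * (A_gg @ g + A_gm @ m + b_g)    # differential part
    return np.array(G), np.array(M)

g0 = np.array([0.5, 0.2, 0.1])
b_g = np.array([1.0, 0.0, 0.0])
# Two conditions, increasing either metabolite 1 or metabolite 2.
G1, M1 = simulate_linear_dae(g0, b_g, np.array([1.0, 0.0]))
G2, M2 = simulate_linear_dae(g0, b_g, np.array([0.0, 1.0]))
```

Each call returns 11 time points per condition, mirroring the sampling described above; the sampled metabolite values satisfy the algebraic constraint by construction.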

**Fig. 2: Performance on linear network motifs.**

Figure 2c shows the confidence values predicted by MINIE for gene-gene and metabolite-to-gene interactions. By setting an appropriate threshold, ε = 0.7, the regulatory network can be perfectly reconstructed with high confidence (Fig. 2d). To measure the method’s ability to distinguish between classes (true vs. non-existing links) across all possible thresholds, we used the area under the receiver-operating characteristic curve (AUROC). MINIE achieved an average AUROC of 0.99 over all datasets (Fig. 2e). Overall, these results demonstrate that MINIE reliably infers the network topology in a linear, low-dimensional model.
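For reference, the AUROC can be computed directly from confidence scores and ground-truth labels as the probability that a randomly chosen true link receives a higher score than a randomly chosen non-existing link (ties count one half). The sketch below uses this pairwise definition; the function name and the toy scores are illustrative only.

```python
import numpy as np

def auroc(scores, labels):
    """AUROC via pairwise comparison of true links against non-existing links."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels, dtype=bool).ravel()
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one true and one non-existing link")
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / diff.size)

# Two true links (scores 0.9 and 0.3) and two non-existing links (0.8 and 0.1).
example = auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])  # 3 of 4 pairs ranked correctly
perfect = auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
```

The pairwise form is quadratic in the number of links, which is acceptable for small motifs; larger networks would call for a rank-based implementation.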

We emphasise that the identifiability of multi-omic systems is crucial in our study. Identifiability refers to the ability to uniquely estimate model parameters from measurements. Linear DAE models, unlike their ODE counterparts, can be unidentifiable even when all variables are perfectly measured, depending on the network structure³⁸. Two key reasons underlie this lack of identifiability. First, when A_mm is invertible, it follows from Eqs. (2b) and (4) that the DAE can be reduced to the ODE $\dot{{\boldsymbol{g}}}=({A}_{{\rm{gg}}}-{A}_{{\rm{gm}}}{A}_{{\rm{mm}}}^{-1}{A}_{{\rm{mg}}}){\boldsymbol{g}}$ (up to constant input terms). The gene dynamics can thus be equivalently represented either as a DAE or an ODE, making the model non-unique. Yet, DAEs (representing multi-omic networks) are generically sparser than their ODE counterparts (representing single-omic networks). For example, the network in Fig. 2a has 3 edges in the DAE model compared to 4 edges in the ODE model (ignoring self-edges). MINIE favours the identification of DAE models by promoting network sparsity in the inference of the mapping Γ and the covariance matrix of the GP functions. Second, DAEs may display equivalent dynamics at the metabolome layer (for instance, metabolite trajectories that are proportional to one another), hindering the identification of the regulatory metabolites. Breaking this proportionality contributes to the system identification, which, together with the previous assumption, is essential for accurately assessing an algorithm’s predictive capacity when using synthetic networks. See Supplementary Material, Section 3, for further discussion.
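The sparsity argument can be checked numerically on a hypothetical motif (not the network of Fig. 2a): two genes produce one metabolite, which in turn regulates three other genes. Eliminating the metabolite from Eq. (4) with Eq. (2b) gives the reduced gene-gene matrix A_gg − A_gm A_mm⁻¹ A_mg, and the sketch counts the off-diagonal nonzero entries of both representations.

```python
import numpy as np

def count_edges(A, tol=1e-12):
    """Number of nonzero off-diagonal entries, i.e. edges ignoring self-edges."""
    off = np.array(A, dtype=float)
    np.fill_diagonal(off, 0.0)
    return int(np.count_nonzero(np.abs(off) > tol))

# Hypothetical motif: g1, g2 -> m1 and m1 -> g3, g4, g5.
A_gg = -np.eye(5)                                   # self-degradation only
A_gm = np.array([[0.0], [0.0], [1.0], [1.0], [1.0]])
A_mg = np.array([[1.0, 1.0, 0.0, 0.0, 0.0]])
A_mm = np.array([[-1.0]])

# Full multi-omic (DAE) interaction matrix of Eq. (4).
A_dae = np.block([[A_gg, A_gm], [A_mg, A_mm]])
# Equivalent gene-only (ODE) matrix after eliminating the metabolite.
A_ode = A_gg - A_gm @ np.linalg.inv(A_mm) @ A_mg

edges_dae = count_edges(A_dae)
edges_ode = count_edges(A_ode)
```

In this motif the DAE representation has 5 edges while the reduced ODE has 6, because every producer gene becomes directly linked to every gene regulated by the metabolite.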

Case studies on nonlinear multi-omic models

The linear network motifs are useful for initial validation, but lack the complexity inherent in biological systems. To address this challenge, we consider two nonlinear biological models: a curated multi-omic network described by Hill-type regulatory interactions and the canonical lac operon model.

Curated multi-omic network

Figure 3 shows MINIE’s performance on a nonlinear multi-omic regulatory network curated from the literature. Since experimentally validated ground truths for multi-omic networks remain scarce, we constructed a realistic network based on established biological principles. This design draws from the scientific review³⁹, which focuses on the regulation of metabolism at the transcriptomic level. By integrating the GRNs and metabolic networks described therein (along with reported inter-omic interactions; see Supplementary Material, Section 4 for details), we built a multi-omic network consisting of 9 genes, 8 metabolites, and 26 interactions (Fig. 3a). We used the BoolODE algorithm⁴⁰ to generate biologically relevant multi-omic time-series data under two experimental conditions: mutant and control (Supplementary Material, Section 2). The BoolODE algorithm accounts for the nonlinear dynamics of cellular processes by modelling the saturation of molecular concentrations using the Hill function. We adapted the original code to simulate the molecular regulation under different timescales (i.e., ×75 faster for metabolites) based on comprehensive studies across biological systems^41,42,43; for completeness, a performance comparison under different scaling factors is also reported in Table S1. The synthetic data is represented in Fig. 3b, together with the fitted pseudotime and metabolic trajectories by MINIE (for illustration purposes, only Cholesterol and Ornithine trajectories are shown). Despite the nonlinearity of the data, the inferred transcriptome–metabolome mapping successfully reconstructs the metabolite trajectories, showing a good fit with the original data.
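The effect of the timescale scaling can be illustrated with a toy two-variable system. This is a sketch under stated assumptions and not the BoolODE formulation: a slow gene variable relaxes towards a constant level, and a metabolite variable follows a Hill function of the gene with its rate multiplied by a scaling factor. The Hill parameters, the rate constants, and the function names are hypothetical.

```python
import numpy as np

def hill(x, K=1.0, n=2):
    """Hill-type activation, saturating at 1 for large x."""
    return x**n / (K**n + x**n)

def tracking_error(scale, t_end=10.0, dt=1e-3):
    """Largest gap between the metabolite and its quasi-steady-state value.

    dg/dt = 1 - 0.5 g              (slow gene dynamics)
    dm/dt = scale * (hill(g) - m)  (metabolite dynamics, sped up by `scale`)
    Both equations are integrated with forward Euler from g = m = 0.
    """
    g, m = 0.0, 0.0
    worst = 0.0
    for _ in range(int(round(t_end / dt))):
        g_new = g + dt * (1.0 - 0.5 * g)
        m_new = m + dt * scale * (hill(g) - m)
        g, m = g_new, m_new
        worst = max(worst, abs(m - hill(g)))
    return worst

err_fast = tracking_error(75.0)  # metabolite 75 times faster than the gene
err_slow = tracking_error(1.0)   # no timescale separation
```

With the ×75 scaling the metabolite stays close to its quasi-steady-state value throughout the simulation, which is the regime in which the algebraic constraint of Eq. (1) is a reasonable approximation; without separation the gap is much larger.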

**Fig. 3: MINIE’s performance on nonlinear multi-omic model.**

The confidence scores output by MINIE are indicated in Fig. 3c. The separation between true positive links and non-existing links enables us to set a threshold, ε = 0.4, where eight true positives (out of 10) and one false positive are identified (Fig. 3d). The false positive could be explained by the dynamic similarity between C/EBPα and PPARβ (Fig. S2); this led to a low confidence score of 0.33 for the true link from C/EBPα to HNF4α, which is excluded as a false negative for ε = 0.4. Despite these minor discrepancies, MINIE achieved a strong performance, with an AUROC of 0.93. Notably, the inter-omic link from Ornithine to SREBP1c was correctly inferred with moderate confidence (0.50), which further supports MINIE’s reliability in complex scenarios.

Lac operon model

Figure 4 shows MINIE’s performance on the well-characterised Escherichia coli lac operon. Unlike the previous example, this model provides both a well-known topology and experimentally measured kinetic parameters, enabling more stringent validation. We adopted the delay-differential equation model in ref. ⁴⁴, which describes the coupled dynamics of five molecular species: lactose (L), allolactose (A), mRNA (M) transcribed from the lacZ gene, β-galactosidase (B), and membrane permease (P). These represent fast (metabolites: L and A) and slow (genes/proteins: M, B and P) variables (Fig. 4a), accounting for the timescale separation modelled in MINIE. The simulations were performed in a stochastic Langevin framework under two environmental scenarios: an induction phase with high extracellular lactose and increasing nutrition, and a nutrient-rich repression phase with decreasing extracellular lactose (Fig. 4b).

**Fig. 4: MINIE’s performance on *Escherichia coli* lac operon.**

Despite the system’s strong nonlinearity and time-delay regulation, MINIE accurately inferred the regulatory links, achieving an AUROC of 0.93 and AUPRC of 0.85 (Fig. 4d). Notably, the key inter-omic link from allolactose to mRNA was inferred with high confidence (0.88), demonstrating robust performance under mechanistically realistic conditions.

These two case studies—one inspired by biological principles and the other grounded in mechanistic data—demonstrate MINIE’s ability to infer complex regulatory networks under nonlinear dynamics, timescale separation, and biological variability.

Benchmarking MINIE against published algorithms

Benchmarking new algorithms is essential to characterise their pros and cons over existing methods. However, MINIE is the first network inference method designed specifically for time-series multi-omic data. Hence, we benchmarked MINIE against state-of-the-art GRN inference methods using two strategies to validate MINIE’s new features: (1) considering a synthetic multi-omic dataset for the evaluation of available GRN inference methods (even though they are only designed for single-omic studies and do not account for multiple timescales), and (2) comparing the GRN inference capabilities of MINIE solely on transcriptomic single-cell data using the BEELINE pipeline⁴⁰.

Multi-omic dataset benchmarking

In the first strategy, we used the multi-omic model investigated in Fig. 3. We compared MINIE with BINGO³ and dynGENIE3²⁰, which are the top-performing methods in the DREAM4 in silico GRN inference challenge^3,45,46,47. Since BINGO and dynGENIE3 were designed for bulk time-series data, we used as input data for all three methods the average molecular concentrations across all cells at each time point. For BINGO and dynGENIE3, we concatenated the gene expression and metabolomic data into a single input matrix (under the premise that these data were part of a single-omic layer). In contrast, MINIE used its two-step pipeline to integrate metabolomic and transcriptomic data. All methods output confidence matrices, which are evaluated using the AUROC and area under the precision-recall curve (AUPRC) metrics. Figure 5 shows that MINIE outperforms both BINGO and dynGENIE3 in terms of AUROC and AUPRC by a margin of 13% and 8%, respectively, with AUROC exceeding 90%. Although all methods face similar challenges, such as misidentifying regulators like C/EBPα and PPARβ, MINIE stands out as the only approach capable of uniquely inferring the regulatory links from metabolites to genes (Fig. S8). A comparison of running times for the three methods is also included in Table S4.

**Fig. 5: Benchmark of MINIE against state-of-the-art GRN inference methods.**

Importantly, these results constitute a de facto ablation study, isolating the functional advantages of MINIE’s design choices. When applied to bulk RNA-seq data, BINGO and MINIE operate under equivalent conditions except for the inclusion of the metabolomic layer in MINIE through the transcriptome–metabolome mapping. The superior performance of MINIE in this setting directly reflects the added value of this mapping and its integration into the network inference step. Furthermore, MINIE achieves even higher predictive accuracy and confidence when applied to single-cell data (Fig. 3d) compared to its bulk counterpart (Fig. S8a), underscoring the informational value of single-cell data for network inference. Together, these results validate the impact of MINIE’s core design elements: the integration of multi-omic data modalities, the modelling of molecular layer dynamics, and the exploitation of single-cell variability.

Single-cell dataset benchmarking

In the second strategy, we benchmarked MINIE’s capacity to infer GRNs from scRNA-seq data using the BEELINE pipeline⁴⁰. The analysis is based on synthetic, curated, and experimental datasets derived from single-cell transcriptomic data, and considers the following performance metrics: the AUPRC ratio, which compares the AUPRC score of a given method to that of a random predictor, and the early-precision ratio (EPR), which measures the fraction of true positives among the top-k predicted edges relative to a random predictor (accounting for the edge density of the ground truth). Figure 6 compares the performance of MINIE with other published methods^{16,17,23,24,48,49,50,51,52,53,54,55} included in the BEELINE study, whereas a comparison of the computational running times is included in Table S5. The synthetic networks were designed to evaluate each method’s ability to infer regulatory networks producing a variety of trajectories observed in differentiation and development studies. Figure 6a shows that MINIE achieved superior performance in three out of six network motifs (bifurcating, bifurcating converging, and trifurcating) and closely matched the best-performing methods in the remaining ones (linear, cycle, and long linear). Additionally, MINIE’s performance was consistent across different cell numbers (100, 200, 500, 2000, and 5000), with a robust average AUPRC close to 0.8 across multiple network topologies (Fig. S9). Figure 6b shows the benchmark using curated networks (mCAD, VSC, HSC, GSD) derived from validated Boolean models. MINIE achieved competitive AUPRC ratios, particularly excelling in the HSC and GSD networks. Although it performed poorly in the mCAD network (as did most other methods), MINIE still ranks among the top-performing methods overall. Notably, our method maintained consistently high performance across both curated and synthetic datasets, in contrast to other methods that typically excelled in one domain but underperformed in the other.
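The two ratio metrics can be sketched as follows. The example makes three assumptions that are not taken from the BEELINE code: the AUPRC is approximated by average precision, the random-predictor baseline for both metrics is the fraction of true edges among all candidate edges, and k is set to the number of edges in the ground truth. Function names and toy values are hypothetical.

```python
import numpy as np

def average_precision(scores, labels):
    """Mean of the precision values at the rank of each true edge."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels, dtype=bool).ravel()
    order = np.argsort(-scores, kind="stable")
    hits = labels[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision_at_k[hits].mean())

def auprc_ratio(scores, labels):
    """Average precision divided by the edge density (random baseline)."""
    density = np.asarray(labels, dtype=bool).mean()
    return float(average_precision(scores, labels) / density)

def early_precision_ratio(scores, labels):
    """Precision among the top-k edges, k = number of true edges, over density."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels, dtype=bool).ravel()
    k = int(labels.sum())
    top_k = np.argsort(-scores, kind="stable")[:k]
    return float(labels[top_k].mean() / labels.mean())

scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
ap = average_precision(scores, labels)       # (1 + 2/3) / 2
ratio = auprc_ratio(scores, labels)          # ap / 0.5
epr = early_precision_ratio(scores, labels)  # (1/2) / 0.5
```

A ratio of 1 corresponds to random performance under these assumptions, so values above 1 indicate that a method ranks true edges ahead of what edge density alone would predict.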

**Fig. 6: Benchmark of MINIE using the BEELINE pipeline.**

Finally, we tested MINIE on experimental scRNA-seq datasets using ground truths reconstructed from several resources containing regulatory information, such as ENCODE⁵⁶, DoRothEA⁵⁷, and STRING⁵⁸. MINIE’s performance, while strong on synthetic and curated datasets, was limited with these experimental data (Fig. S10). This difference raised two hypotheses: (i) biases in the ground-truth networks that favour correlation-based methods or (ii) limitations in MINIE’s scalability to large datasets. However, tests with smaller experimental datasets produced similar trends (Fig. S10; second row), discarding the second hypothesis. Regarding the first hypothesis, the BEELINE study itself demonstrated that GRN inference performance varies substantially depending on whether STRING or ChIP-seq-based networks are used as ground truths, with higher scores observed on STRING despite similar network densities⁴⁰. Likewise, other independent studies have also reported that imposed ground truths may introduce biases and do not always capture direct regulatory interactions^59,60. To further explore this, we applied a traditional correlation-based analysis to the experimental datasets, which achieved notably high performance—particularly when evaluated against nonspecific and functional ground truths (Fig. S10; first row). These findings suggest that the current ground-truth construction may systematically favour correlation-based rather than causal-based approaches, highlighting a potential area for improvement in the benchmarking of causal network inference methods.

Experimental validation on Parkinson’s disease data

We applied the pipeline to study PD⁶¹. The experimental procedure focused on the differentiation of induced Pluripotent Stem Cells (iPSCs) to dopaminergic neurons in cell lines derived from a PD patient with a PINK1 mutation and a control subject. Time-series bulk metabolomics and scRNA-seq data were collected at six stages of differentiation (days 0, 8, 18, 25, 32, and 37) for both healthy and PD cell lines. Figure 7a, b illustrates the dynamics of the experimental data together with the estimated pseudotime and metabolite trajectories generated by MINIE.

**Fig. 7: Multi-omic network inference for the Parkinson’s disease datasets.**

The PINK1 mutation was modelled as an external perturbation to identify genes with dynamic responses to this mutation. We analysed the results across three dimensions (Fig. 7c): predicted gene-gene interactions (left panel), regulatory role of metabolites (middle panel), and perturbation target candidates (right panel). To visualise the reconstructed networks based on MINIE’s predictions, we consider two choices of thresholds: ε = 0.04 (Fig. 7d) and 0.06 (Fig. 7e). At a lower threshold, the reconstruction yields a small-scale network with 63 genes and 6 metabolites, interconnected by 274 links. The higher threshold leads to a single connected component network with 552 molecules and 3669 interactions. The small-scale network is particularly well-suited for biological interpretability, as it highlights only the most confident interactions. On the other hand, the large-scale nature of the latter network demonstrates the comprehensive scope of MINIE’s predictive capacity and offers a complete resource amenable to computational analysis within network medicine and other data science frameworks^62,63,64. Such large-scale networks can enable, for example, the identification of key hubs, pathways, and relationships that may not be immediately apparent in raw data.

In what follows, we focus on the biological interpretation of the regulatory network inferred in Fig. 7d. A direct quantitative assessment of our predictions is not possible due to a lack of ground truth. Nevertheless, we conducted a comprehensive literature review of reported biological interactions to qualitatively validate our findings, a common practice in network inference studies^65,66,67. Several top-scoring genes predicted as perturbation targets are known to be linked to neurodegeneration, including PD. For instance, ATP5A1, involved in mitochondrial function and mitophagy, is upregulated in models with PINK1 mutations and has been linked to PD pathogenesis⁶⁸. Similarly, RHOA, a key regulator of cytoskeletal dynamics, has been implicated in PD through its role in maintaining neuronal structure and function, with dysregulation observed in PD models⁶⁹. Other identified genes (e.g., PHGDH and DNAJC7), though not directly linked to PINK1, have been associated with neuronal health and survival in PD^70,71. We have also identified several high-confidence predictions for gene-gene interactions, including well-known interactions such as H1F0-MAP1B. Other links, like PCNA-EEF1B2, can represent indirect interactions, which are predicted with higher confidence than the direct regulations along the path PCNA-EEF1A1-EEF1B2. Notably, we also identified novel potentially relevant links, including several interactions regulating ZFAS1, a gene known for its role in reducing neuronal damage and inhibiting inflammation and apoptosis, though its specific involvement in PD remains uncertain⁷². Finally, our network also highlights several PD-related hubs, including: NREP, involved in neural regeneration and plasticity; DLK1, a stress-response kinase implicated in dopaminergic neuron degeneration; and PDP1, a regulator of mitochondrial metabolism. These findings show the potential of our method to uncover both established and previously unreported interactions that may be relevant to PD development.

Validating metabolite-gene associations proved challenging due to the limited literature on metabolomic regulation. We therefore qualitatively examined our top predicted metabolites, including Glutathione (GSH/GSSG) and Glutamate, both well established in PD pathophysiology. Glutathione is depleted in early PD, perpetuating oxidative stress, mitochondrial dysfunction, and neuronal death⁷³, while dysregulated glutamate contributes to excitotoxicity and progressive neuronal damage⁷⁴. Although specific predicted metabolite-gene pairs like Glutamate-TAF7 and GSSG-CALM2 are not currently reported in the HMDB or STITCH databases, our results suggest biologically plausible links. In particular, the GSSG-CALM2 association is supported by mechanistic evidence, as glutathione redox status modulates calcium/calmodulin signalling through protein S-glutathionylation, a process strongly implicated in PD-related oxidative stress. These findings underscore the potential of our approach to generate novel, testable hypotheses that extend beyond existing biochemical annotations.

Discussion

This study presented MINIE, a two-step algorithm designed for multi-omic network inference from time-series data. Unlike other state-of-the-art algorithms designed for time-series (e.g., BINGO and dynGENIE3), which focus exclusively on single-omic data, MINIE models both transcriptomic and metabolomic dynamics within a DAE framework. Our results show that, when accounting for datasets with timescales differing by 100-fold or more, single-omic methods struggle to accurately infer regulatory cross-omic interactions, while MINIE successfully predicts these links with high confidence. These findings emphasise the importance of tailoring algorithms to the unique temporal and molecular characteristics of multi-omic data, rather than directly relying on single-omic methods. Furthermore, even in benchmarks comprising solely single-omic data, MINIE attains a highly competitive performance in network inference from scRNA-seq data, both in synthetic and curated data. Such performance stems from our Bayesian framework for pseudotime estimation, designed to model the cell variability intrinsic in single-cell data. MINIE’s performance was validated on both synthetic and experimental datasets. On synthetic data (including linear and nonlinear dynamics), MINIE accurately inferred the underlying network topologies, capturing both intra- and inter-omic interactions, and outperformed other published methods. A qualitative analysis on a PD study revealed that MINIE successfully identified known regulatory interactions (especially gene-gene regulations) while also uncovering novel links with potential relevance to the disease.

Despite these strengths, MINIE has limitations. Its performance depends on the quality and completeness of the input data, as well as prior knowledge of the metabolic network topology. Moreover, our analysis shows that while MINIE is robust across a wide range of cell counts (from a few hundred to thousands), its accuracy can substantially decrease when the number of time points or experimental conditions is limited. These challenges are expected given the high-dimensional nature of the data, and the fact that datasets comprising a single experiment often lack sufficient excitation—that is, the necessary variability in the input conditions to accurately identify system parameters. Moreover, our linear mapping approach, though beneficial for interpretability and computational efficiency, likely oversimplifies nonlinear regulatory mechanisms in metabolite-gene interactions. As we demonstrate in the Supplementary Material, Section 8 and Fig. S7, nonlinear methods for sparse model identification can become computationally intractable and are prone to overfitting in high-dimensional transcriptomic datasets. Future work should therefore focus on developing sophisticated nonlinear mapping methods that preserve biological interpretability while effectively handling the dimensionality challenges inherent to such data. Finally, although prior knowledge typically enhances inference accuracy, it may introduce biases when network structures are incomplete or inaccurate.

With current experiments, MINIE cannot infer the metabolic network. This is due to the lack of identifiability in the algebraic part (i.e., metabolite dynamics) of the DAE model, which directly restricts our ability to reconstruct the metabolic network without imposing additional constraints³⁸. Given this intrinsic limitation, we focus on a critical subproblem: inferring how the metabolome regulates the transcriptome, an area that remains largely unexplored. Fortunately, the metabolome is one of the most well-characterised omic layers, providing a wealth of prior knowledge that can be effectively leveraged for network inference. This prior knowledge is tailored to the experimental conditions through the inferred transcriptome–metabolome mapping, enabling the integration of metabolomic data into our method. Future work could explore more flexible approaches to incorporate prior knowledge, which would further increase MINIE’s adaptability to novel biological contexts. Moreover, MINIE’s reliance on a pseudotime approach may not be ideal for capturing highly heterogeneous behaviour. The presented transcriptome–metabolome mapping could also be integrated in an approach based on modelling the propagation of the full single-cell distributions²⁶. Finally, reliance on GP modelling and Bayesian inference poses scalability challenges for larger datasets. MCMC sampling, while effective, is computationally intensive and may experience slow convergence in high-dimensional settings (see Supplementary Material, Section 6, for experiments and discussion on MINIE’s computational efficiency). Despite these challenges, our results demonstrate that the method can be successfully applied to networks comprising around 500 nodes. In high-dimensional problems, we strongly recommend conducting several MCMC runs in parallel. Parallelisation not only accelerates the inference process but also improves exploration of the parameter and topology spaces.

Looking ahead, several promising directions could extend and refine MINIE. Integrating additional omic layers, such as proteomics and epigenomics, would provide a more holistic understanding of regulatory mechanisms, bringing MINIE closer to reconstructing the full interactome. The proteome, in particular, can be naturally incorporated in the DAE framework as protein translation operates on timescales comparable to mRNA transcription⁴⁰. Additionally, the advent of single-cell proteomics offers a rich dataset that is compatible with our pseudotime approach⁷⁵. To fully capitalise on these developments, there is an urgent need for comprehensive multi-omic benchmarks with validated ground truths to facilitate more rigorous evaluation and refinement of network inference methods.

Methods

MINIE’s algorithm

Pseudocode describing MINIE’s pipeline is included in the Supplementary Material (Algorithms 1 and 2). A MATLAB implementation of MINIE is available on GitLab (see “Data availability” section). Below, we detail the two main steps implemented in MINIE, namely the transcriptome–metabolome mapping inference and the network inference via Bayesian regression.

Notation

The scRNA-seq data comprises a sequence of samples collected at specific time points, with each sample capturing the gene expression levels of individual cells. For each time point k = 1, …, T, the number of cells measured is denoted by N_k. This dataset is represented by matrix G = [G₁ …G_Y], where each column ${{\boldsymbol{G}}}_{i}\in {{\mathbb{R}}}_{\ge 0}^{{n}_{{\rm{g}}}}$ represents the gene expression levels of cell i. The matrix contains all cells measured over all time points, with the total number of cells being $Y=\mathop{\sum }\nolimits_{k = 1}^{T}{N}_{k}$, and is sorted by time points such that $G=[{{\boldsymbol{G}}}_{1}\,\ldots \,{{\boldsymbol{G}}}_{{N}_{1}}\,{{\boldsymbol{G}}}_{{N}_{1}+1}\,\ldots \,{{\boldsymbol{G}}}_{{N}_{1}+{N}_{2}}\,\ldots \,{{\boldsymbol{G}}}_{Y}]$.

The metabolomic data is represented by $M\in {{\mathbb{R}}}_{\ge 0}^{{n}_{{\rm{m}}}\times T}$, a matrix of observed responses containing n_m experimental metabolite concentrations measured in bulk over T time points.

Transcriptome–metabolome mapping inference

We infer the transcriptome–metabolome mapping Γ from metabolomic and transcriptomic data using a sparse regression formulation.

Data processing

The inherent differences in data acquisition methods for the transcriptome and metabolome result in different data modalities. While transcriptomic data are often available at single-cell resolution, metabolomic data are typically collected in bulk. To maintain consistency in data modality for the inference of Γ, we aggregate the scRNA-seq data by averaging the gene expression levels across cells measured at the same time point. This procedure provides a bulk-like gene expression matrix, denoted by $\bar{G}=[{\bar{{\boldsymbol{G}}}}_{1},\ldots ,{\bar{{\boldsymbol{G}}}}_{T}]\in {{\mathbb{R}}}_{\ge 0}^{{n}_{{\rm{g}}}\times T}$, which is compatible with the metabolomic data matrix M in the regression problem. Another essential component of this step is data normalisation. Transcriptomic and metabolomic data often differ substantially in scale, which can cause numerical instability. To address this challenge, we normalise the dynamic range of each gene and metabolite to one (e.g., for gene i, $\mathop{\max }\nolimits_{j}{\bar{G}}_{ij}-\mathop{\min }\nolimits_{j}{\bar{G}}_{ij}=1$, where j runs over time points).
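As an illustration of the aggregation and range normalisation described above, the following Python sketch averages cells per time point and rescales each gene so that its range over time points equals one. The function names and the list-of-lists data layout are assumptions made for this example and are not taken from the MATLAB implementation of MINIE.

```python
def bulk_average(cells_by_timepoint):
    """Average single-cell profiles per time point.

    cells_by_timepoint[k] is the list of cells measured at time point k;
    each cell is a list of n_g expression values.
    Returns T bulk-like profiles, each of length n_g.
    """
    profiles = []
    for cells in cells_by_timepoint:
        n_genes = len(cells[0])
        profiles.append([sum(cell[g] for cell in cells) / len(cells)
                         for g in range(n_genes)])
    return profiles


def normalise_dynamic_range(profiles):
    """Rescale every gene so that max - min over time points equals one.

    Genes that are constant over time are left unscaled (an assumption
    made here to avoid division by zero).
    """
    n_genes = len(profiles[0])
    scaled = [list(p) for p in profiles]
    for g in range(n_genes):
        values = [p[g] for p in profiles]
        span = max(values) - min(values)
        if span > 0:
            for k in range(len(profiles)):
                scaled[k][g] = profiles[k][g] / span
    return scaled
```

The same range normalisation would apply row by row to the metabolite matrix M.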

Constructing the metabolic network

The regression problem for inferring matrices A_mm and A_mg is highly underdetermined: the number of possible molecular interactions far exceeds the number of experimental samples. To reduce the degree of underdetermination, a key assumption in the inference of A_mm and A_mg is that the topology of the metabolic network is known, but the interaction weights within this topology are not. To determine the topology of the metabolic network, we curated the Human 1 GEM³⁷ comprising over 10,000 metabolites, 3500 genes, and 13,000 metabolic reactions among these molecules (see Supplementary Material, Section 1, for details). From this model, we determine the nonzero structure of the interaction matrices A_mm and A_mg. This procedure assigns the positions of the nonzero elements in both matrices, while the actual weights of these interactions are estimated by solving the regression problem. Sensitivity analysis of the network inference results with respect to gaps and errors in the prior metabolic network topology can be found in Supplementary Material, Section 7 and Fig. S6.

Sparse regression

We now infer the parameters of matrices A_mm and A_mg in the linear algebraic equation (2b). To prevent trivial solutions where all coefficients are zero (i.e., A_mm = A_mg = 0), we employ the decomposition ${A}_{{\rm{mm}}}=-{I}_{{n}_{{\rm{m}}}}+{\bar{A}}_{{\rm{mm}}}$, where ${\bar{A}}_{{\rm{mm}}}$ contains only off-diagonal elements and ${I}_{{n}_{{\rm{m}}}}$ is an identity matrix of size n_m. Thus, Eq. (2b) is rewritten as

$${\boldsymbol{m}}={\bar{A}}_{{\rm{mm}}}{\boldsymbol{m}}+{A}_{{\rm{mg}}}{\boldsymbol{g}}+{{\boldsymbol{b}}}_{{\rm{m}}}.$$

(5)

Note that, due to the aforementioned decomposition, the coefficients in ${\bar{A}}_{{\rm{mm}}}$, A_mg, and b_m are appropriately scaled by the inverse of the diagonal entries of A_mm. The nonzero elements of ${\bar{A}}_{{\rm{mm}}}$, A_mg, and b_m are then estimated through a linear regression approach, formulated as M = XΘ. Here, $X=[M\,\,\,\bar{G}]$ is the matrix of predictor variables, and Θ is the vector of unknown regression coefficients corresponding to the previously assigned nonzero elements of ${\bar{A}}_{{\rm{mm}}}$, A_mg, and b_m.
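For reference, the rewriting can be made explicit. The derivation below assumes that the algebraic equation has the form $0={A}_{{\rm{mm}}}{\boldsymbol{m}}+{A}_{{\rm{mg}}}{\boldsymbol{g}}+{{\boldsymbol{b}}}_{{\rm{m}}}$, with the diagonal of A_mm already scaled to −1. This form is inferred from the decomposition above and should be read as an assumption, since the equation itself is not reproduced in this section.

$$\begin{aligned}0&=(-{I}_{{n}_{{\rm{m}}}}+{\bar{A}}_{{\rm{mm}}}){\boldsymbol{m}}+{A}_{{\rm{mg}}}{\boldsymbol{g}}+{{\boldsymbol{b}}}_{{\rm{m}}}\\ \Rightarrow \quad {\boldsymbol{m}}&={\bar{A}}_{{\rm{mm}}}{\boldsymbol{m}}+{A}_{{\rm{mg}}}{\boldsymbol{g}}+{{\boldsymbol{b}}}_{{\rm{m}}}\\ \Rightarrow \quad {\boldsymbol{m}}&={({I}_{{n}_{{\rm{m}}}}-{\bar{A}}_{{\rm{mm}}})}^{-1}({A}_{{\rm{mg}}}{\boldsymbol{g}}+{{\boldsymbol{b}}}_{{\rm{m}}})=-{A}_{{\rm{mm}}}^{-1}{A}_{{\rm{mg}}}{\boldsymbol{g}}-{A}_{{\rm{mm}}}^{-1}{{\boldsymbol{b}}}_{{\rm{m}}}.\end{aligned}$$

The last step requires ${I}_{{n}_{{\rm{m}}}}-{\bar{A}}_{{\rm{mm}}}$ to be invertible. The gene-dependent term then defines a linear map from gene expression to metabolite concentrations.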

Our approach assumes that the time-series sampling typically consists of experimental designs with two conditions (e.g., mutant and control) and 4–6 time points per experiment. Given the sparse nature of metabolic networks, it is expected that most metabolites are regulated by fewer than 12 interactions (which defines an upper bound based on the combination of 6 time points across 2 conditions). However, when metabolites have a large number of regulators or the number of samples is small, the regression problem can still be underdetermined. To mitigate this issue, we solve the following sparse regression problem

$$\min \parallel X\Theta -M{\parallel}_2^2+\lambda \parallel \Theta {\parallel }_{2}^{2},$$

(6)

where the regularisation term shrinks the regression coefficients towards zero, ensuring a unique sparse solution. This step yields the coefficients Θ, which are then integrated in their corresponding positions in A_mm and A_mg and used to derive $\Gamma =-{A}_{{\rm{mm}}}^{-1}{A}_{{\rm{mg}}}$.
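A minimal Python sketch of the regularised problem in Eq. (6) for a single metabolite, restricted to the predictors permitted by the prior topology. It uses the standard closed-form ridge estimator θ = (XᵀX + λI)⁻¹Xᵀy. The helper names, the small pure-Python linear solver, and the data layout are assumptions for illustration and do not reproduce the MATLAB implementation.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        residual = aug[i][n] - sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = residual / aug[i][i]
    return x


def ridge_on_support(samples, targets, support, lam):
    """Ridge regression using only the predictor columns listed in `support`.

    samples: predictor rows, one per time point and condition
    targets: observed values of one metabolite, one per sample
    support: column indices permitted by the prior network topology
    lam:     regularisation weight (lambda in Eq. (6)); lam > 0 keeps the
             system solvable even with fewer samples than predictors
    Returns {column index: coefficient}; all other coefficients are zero.
    """
    x = [[row[j] for j in support] for row in samples]
    p = len(support)
    gram = [[sum(r[i] * r[j] for r in x) + (lam if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * t for r, t in zip(x, targets)) for i in range(p)]
    return dict(zip(support, solve_linear(gram, rhs)))
```

In this sketch the sparsity comes from the fixed support, while the penalty only shrinks the retained coefficients and makes the solution unique.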

We note that this linear formulation may overlook nonlinear metabolite-gene regulation. Yet, nonlinear methods proved computationally intractable and prone to overfitting in high-dimensional settings (as shown in Supplementary Material, Section 8 and Fig. S7).

Network inference via Bayesian regression

We infer the multi-omic regulatory network from the time-series scRNA-seq data and the previously inferred transcriptome–metabolome mapping. Following Eq. (1), the dynamics of the gene expression are modelled as a nonlinear stochastic differential equation

$$\dot{{\boldsymbol{g}}}={\boldsymbol{f}}({\boldsymbol{g}},{\boldsymbol{m}},{{\boldsymbol{b}}}_{{\rm{g}}};{\boldsymbol{\theta }})+{\boldsymbol{\rho }}({\boldsymbol{g}},{\boldsymbol{m}}){\boldsymbol{w}},$$

where f is a function of gene expression and metabolite concentrations, modelled as a collection of GPs, ${\boldsymbol{f}}={[{{\boldsymbol{f}}}_{1},\ldots ,{{\boldsymbol{f}}}_{{n}_{{\rm{g}}}}]}^{\top }$, and w is a white noise with covariance $Q={\rm{diag}}({q}_{1},\ldots ,{q}_{{n}_{{\rm{g}}}})$.

MINIE constructs the posterior distribution p(θ∣G) for model parameters θ, given the observed scRNA-seq data G. To obtain a tractable distribution, latent variables in the form of the continuous gene expression trajectory g and the pseudotime variable τ are introduced and integrated out by Monte Carlo integration:

$$\begin{array}{rl}p({\boldsymbol{\theta }}| G)&=\,\iint \,p({\boldsymbol{\theta }},{\boldsymbol{g}},{\boldsymbol{\tau }}| G)\,{\rm{d}}{\boldsymbol{g}}\,{\rm{d}}{\boldsymbol{\tau }}\\ &\propto \,p({\boldsymbol{\theta }})\iint \,p(G| {\boldsymbol{g}},{\boldsymbol{\tau }},{\boldsymbol{\theta }})\,p({\boldsymbol{g}}| {\boldsymbol{\theta }})\,p({\boldsymbol{\tau }})\,{\rm{d}}{\boldsymbol{g}}\,{\rm{d}}{\boldsymbol{\tau }}.\end{array}$$

Here, p(G∣g, τ, θ) is the measurement model defined below; p(g∣θ) is the prior for gene expression trajectories, governed by the GP model in Eq. (1a); and p(τ) represents the prior distribution over the pseudotime variables, incorporating information from the experimental sampling times.

MINIE’s inference procedure iteratively alternates between sampling the pseudotime estimate, the network topology, and the continuous-time gene trajectories. Three distinct MCMC samplers are employed in this procedure, as described below, whereas the required priors are reported in the Supplementary Material, Section 5. The convergence of the MCMC sampling in the case of the nonlinear multi-omic network represented in Fig. 3 and for the PD case in Fig. 7 is reported in the Supplementary Material, Fig. S11. With high-dimensional problems (n > 50), the convergence is slower. To help in exploration, it is recommended to conduct several MCMC runs (in parallel) and combine the results by taking the average over the confidence matrices. For example, our results on the PD data were obtained from 28 parallel chains.

Data processing

To account for outliers and higher variability in single-cell data compared to bulk data, the normalisation used for single-cell data is slightly different. With single-cell data, the expression levels of all genes are normalised such that the difference between the 95th and 5th percentiles is one. In order to maintain correct scaling of the transcriptome–metabolome mapping, each column of Γ must be scaled by the inverse of the scaling factor for the corresponding gene. Finally, to scale inferred metabolite levels to transcriptomic levels, Γ is normalised row-wise such that the difference of the 95th and 5th percentiles of each row of the product ΓG is one.
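The percentile-based scaling and the matching adjustment of Γ can be sketched as follows. The percentile helper uses linear interpolation between order statistics, which is one common convention and an assumption here; the function names are hypothetical, and the final row-wise normalisation of Γ is not shown.

```python
def percentile(values, q):
    """Percentile (q in [0, 100]) with linear interpolation."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


def robust_scale(values, low=5.0, high=95.0):
    """Scale one gene so that its 95th-5th percentile spread equals one.

    Returns the scaled values and the factor s such that scaled = s * raw.
    Assumes the spread is nonzero.
    """
    spread = percentile(values, high) - percentile(values, low)
    s = 1.0 / spread
    return [s * v for v in values], s


def rescale_gamma_columns(gamma, gene_factors):
    """Divide column j of Gamma by the scaling factor of gene j, so that
    the rescaled Gamma applied to scaled genes reproduces Gamma applied
    to the raw genes."""
    return [[entry / s for entry, s in zip(row, gene_factors)] for row in gamma]
```

For example, if a gene is multiplied by s = 0.5, its column of Γ is multiplied by 2, leaving the predicted metabolite levels unchanged.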

Pseudotime sampling

The integration of scRNA-seq data as input is based on the concept of pseudotime. Pseudotime represents a latent temporal dimension, capturing the progression of cell states along a biological process⁷⁶. Unlike traditional GRN inference methods that require the computation of pseudotime prior to the network inference step⁴⁰, MINIE integrates pseudotime as a parameter to be sampled concurrently with other model parameters. This procedure is detailed as follows.

MINIE models the single-cell gene expression data G as samples drawn from the continuous trajectory g, where the pseudotime τ serves as the sampling time:

$${{\boldsymbol{G}}}_{j}={\boldsymbol{g}}({{\boldsymbol{\tau }}}_{j})+{{\boldsymbol{v}}}_{j},$$

(7)

where G_j denotes the gene expression profile of cell j, τ_j is the associated pseudotime at which the gene expression state ${\boldsymbol{g}}(t)$ is sampled along the trajectory, ${{\boldsymbol{v}}}_{j} \sim {\mathcal{N}}({0}_{{n}_{{\rm{g}}}\times 1},R)$ accounts for measurement noise, and $R=\,\text{diag}\,({r}_{1},\ldots ,{r}_{{n}_{{\rm{g}}}})$ denotes a diagonal covariance matrix. Accordingly, τ = {τ₁, …, τ_Y} is the collection of pseudotimes for Y cells. Pseudotimes are then sampled using a dedicated Metropolis–Hastings MCMC sampler with random walk proposals from the posterior distribution p(τ∣G, g, θ) ∝ p(G∣g, τ, θ)p(τ). The measurement model is given by $p(G| {\boldsymbol{g}},{\boldsymbol{\tau }},{\boldsymbol{\theta }})=\mathop{\prod }\nolimits_{j = 1}^{Y}{\mathcal{N}}({{\boldsymbol{G}}}_{j};{\boldsymbol{g}}({{\boldsymbol{\tau }}}_{j}),R)$. A Gaussian prior p(τ) is imposed on the pseudotimes, with a mean centred on the experimentally observed sampling time of each cell and a variance based on the measurement time intervals. Sensitivity analysis on the pseudotime variance can be found in Supplementary Material, Section 7 and Fig. S6. A weighting scheme for the measurement model to account for zero inflation in single-cell data is described in the Supplementary Material, Section 5.
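To make the pseudotime update concrete, the sketch below runs a random-walk Metropolis–Hastings chain for the pseudotime of a single cell, using the Gaussian measurement model of Eq. (7) and a Gaussian prior. It assumes the trajectory is available as a callable returning the expression vector at a given time; the function names, step size, and chain length are hypothetical choices, and the zero-inflation weighting is omitted.

```python
import math
import random


def log_posterior(tau, cell, trajectory, noise_var, prior_mean, prior_var):
    """Unnormalised log p(tau | G_j, g): Gaussian likelihood plus Gaussian prior."""
    g_tau = trajectory(tau)
    log_lik = -0.5 * sum((x - g) ** 2 / r
                         for x, g, r in zip(cell, g_tau, noise_var))
    log_prior = -0.5 * (tau - prior_mean) ** 2 / prior_var
    return log_lik + log_prior


def sample_pseudotime(cell, trajectory, noise_var, prior_mean, prior_var,
                      n_steps=5000, step_sd=0.2, seed=0):
    """Random-walk Metropolis-Hastings chain for the pseudotime of one cell."""
    rng = random.Random(seed)
    tau = prior_mean  # start at the experimental sampling time
    current = log_posterior(tau, cell, trajectory, noise_var,
                            prior_mean, prior_var)
    chain = []
    for _ in range(n_steps):
        proposal = tau + rng.gauss(0.0, step_sd)
        candidate = log_posterior(proposal, cell, trajectory, noise_var,
                                  prior_mean, prior_var)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            tau, current = proposal, candidate
        chain.append(tau)
    return chain
```

With a one-gene trajectory g(t) = t, an observation of 2.0 with noise variance 0.01, and a prior centred at 1.0 with unit variance, the chain concentrates near τ ≈ 1.99, the analytical posterior mean for this linear-Gaussian toy case.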

Network topology sampling

Network inference is performed using a Bayesian framework to estimate the model parameters θ, including the network topology. The mean function of the GP f in Eq. (1) is defined componentwise as

$${{\boldsymbol{\mu }}}_{i}({\boldsymbol{g}})={{\boldsymbol{b}}}_{i}-{{\boldsymbol{a}}}_{i}{{\boldsymbol{g}}}_{i},$$

(8)

where a_i and b_i represent the rates of mRNA degradation and basal transcription, respectively. The covariance function is given by

$${K}_{i}({\boldsymbol{x}},{\boldsymbol{z}})={\gamma }_{i}\exp \left\{-\mathop{\sum }\limits_{j=1}^{n}{\beta }_{ij}{({{\boldsymbol{x}}}_{j}-{{\boldsymbol{z}}}_{j})}^{2}\right\},$$

(9)

where x = [g_x, m_x] and z = [g_z, m_z]. If an external perturbation is considered, then x = [g_x, m_x, b_g] (and z) concatenates the concentrations of all molecules and the possible perturbation to gene expression dynamics b_g, in which case the summation index j goes up to n + 1. The mapping Γ is applied to gene trajectory samples to estimate metabolite trajectories at the single-cell level: m = Γg − Δb_m. This approach integrates the bulk metabolomic data with single-cell transcriptomic data.
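The mean and covariance functions in Eqs. (8) and (9) translate directly into code. The sketch below is a plain Python transcription for one target gene; the function and argument names are chosen for this example only.

```python
import math


def gp_mean(g_i, a_i, b_i):
    """Mean function of Eq. (8): basal transcription minus linear degradation."""
    return b_i - a_i * g_i


def gp_covariance(x, z, gamma_i, beta_i):
    """Covariance function of Eq. (9) for target gene i.

    x, z:    concatenated molecule levels (genes, metabolites and,
             optionally, the perturbation) at two points
    gamma_i: output scale of the kernel
    beta_i:  non-negative weights beta_ij, one per input dimension
    """
    exponent = sum(b * (xj - zj) ** 2 for b, xj, zj in zip(beta_i, x, z))
    return gamma_i * math.exp(-exponent)
```

Note that an input with β_ij = 0 has no effect on the covariance, so gene i is modelled as independent of molecule j.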

The parameters β_ij quantify the regulatory influence of molecule j on gene i. We factorise each of them as β_ij = S_ijH_ij, where S_ij is the binary indicator for the existence of a link from molecule j to gene i and H_ij ≥ 0 determines the weight of that link. The Bayesian estimation of parameters β_ij is performed by sampling the indicator variables S_ij, the continuous parameters H_ij, and other model parameters included in ${\boldsymbol{\theta }}={\{{S}_{ij},{H}_{ij},{\gamma }_{i},{{\boldsymbol{b}}}_{i},{{\boldsymbol{a}}}_{i},{q}_{i},{r}_{i}\}}_{i = 1,\ldots ,{n}_{{\rm{g}}},j = 1,\ldots ,n}$, using dedicated Metropolis–Hastings MCMC samplers that target the posterior distribution, which is proportional to p(G∣g, τ, θ)p(g∣θ)p(θ). This posterior distribution can be factorised into n_g components such that each component depends only on the hyperparameters θ for one target gene at a time. Therefore, the topology sampling can be performed one target gene at a time. Random walk proposals are used for sampling the continuous parameters. For the topology, we randomly choose between two proposal moves. In the first proposal move, for target gene i, an element S_ij is randomly chosen from the ith row of S, and it is changed to 1 − S_ij (that is, from one to zero or zero to one). In the second proposal move, one existing link is replaced by another.
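The two topology proposal moves can be sketched in a few lines of Python. Only the proposal step is shown; the Metropolis–Hastings acceptance test is omitted. The probability of choosing each move (`p_flip`) and the handling of rows with no links, or with no absent links, are assumptions, as the text does not specify them.

```python
import random


def propose_flip(row, rng):
    """First move: pick one entry of the indicator row and toggle it."""
    proposal = list(row)
    j = rng.randrange(len(proposal))
    proposal[j] = 1 - proposal[j]
    return proposal


def propose_swap(row, rng):
    """Second move: remove one existing link and add one absent link.

    If the row has no links or no absent links, it is returned unchanged
    (an assumption made for this sketch).
    """
    present = [j for j, s in enumerate(row) if s == 1]
    absent = [j for j, s in enumerate(row) if s == 0]
    proposal = list(row)
    if present and absent:
        proposal[rng.choice(present)] = 0
        proposal[rng.choice(absent)] = 1
    return proposal


def propose_topology(row, rng, p_flip=0.5):
    """Choose between the two moves at random; p_flip is a hypothetical setting."""
    if rng.random() < p_flip:
        return propose_flip(row, rng)
    return propose_swap(row, rng)
```

The flip move changes the number of regulators of gene i by one, whereas the swap move keeps it fixed.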

The prior for the network topology can be used to adjust network sparsity and to incorporate prior knowledge on the network topology. Sensitivity analysis on the network sparsity level can also be found in Supplementary Material, Section 7 and Fig. S6.

Trajectory sampling

MINIE manages the high variability and noise levels characteristic of scRNA-seq data by sampling gene expression trajectories g from the posterior distribution p(g∣θ, G, τ) ∝ p(G∣g, τ, θ)p(g∣θ). Here, p(g∣θ) is the prior probability distribution for the gene expression trajectories that depends on the model parameters, notably on the network topology. We note that this distribution can be analytically calculated for a discretised trajectory due to the GP formalism used [³, Supplementary Note 3]. Given the high cell variability in scRNA-seq data, a single trajectory cannot fit the whole data. Thus, the trajectory sampling is used to better explore plausible gene expression trajectories. Trajectories are sampled using a dedicated MCMC sampler based on Crank–Nicolson sampling^77,78 as described in Ref. [³, Suppl. Note 7].
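For readers unfamiliar with Crank–Nicolson sampling, the sketch below shows a generic preconditioned Crank–Nicolson (pCN) chain for a zero-mean Gaussian prior. It is not the sampler of Ref. 3: the trajectory prior in MINIE depends on the GP model and its mean function, so this sketch would apply to deviations from a prior mean at best. All names and default settings are assumptions for illustration.

```python
import math
import random


def pcn_chain(neg_log_lik, sample_prior, x0, n_steps=20000, beta=0.5, seed=0):
    """Preconditioned Crank-Nicolson sampler for a zero-mean Gaussian prior.

    neg_log_lik:  function returning the negative log-likelihood of a state
    sample_prior: function rng -> one draw from the Gaussian prior
    x0:           initial state (list of floats)
    beta:         step size in (0, 1]; beta = 1 gives independent prior draws
    """
    rng = random.Random(seed)
    x, phi = list(x0), neg_log_lik(x0)
    shrink = math.sqrt(1.0 - beta ** 2)
    chain = []
    for _ in range(n_steps):
        xi = sample_prior(rng)
        proposal = [shrink * xk + beta * xik for xk, xik in zip(x, xi)]
        phi_new = neg_log_lik(proposal)
        # The proposal is reversible with respect to the prior, so only
        # the likelihood enters the acceptance probability.
        if rng.random() < math.exp(min(0.0, phi - phi_new)):
            x, phi = proposal, phi_new
        chain.append(x)
    return chain
```

Because the proposal leaves the prior invariant, the acceptance rate does not degrade as the discretisation of the trajectory is refined, which is the usual motivation for this family of samplers in function-space problems.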

Output

The MCMC sampler (dedicated to the network topology) generates a series of network topologies, encoded by adjacency matrix samples S^(l), over multiple iterations l = 1, …, N_iter. These topologies are then averaged to produce the confidence matrix $C=\frac{1}{{N}_{{\rm{iter}}}}\mathop{\sum }\nolimits_{l = 1}^{{N}_{{\rm{iter}}}}{S}^{(l)}$. The dimension of the confidence matrix is n_g × n or n_g × (n + 1) if a perturbation is included. The first n_g columns correspond to gene-gene interactions, the next n_m columns correspond to metabolite-to-gene interactions, and in case a perturbation is included, the last column of the matrix corresponds to direct perturbation targets.
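The construction of the confidence matrix and its column blocks can be sketched as follows, assuming the adjacency samples are stored as nested lists of zeros and ones. The function names are hypothetical.

```python
def confidence_matrix(samples):
    """Average binary adjacency samples S^(l) entrywise."""
    n_iter = len(samples)
    n_rows, n_cols = len(samples[0]), len(samples[0][0])
    return [[sum(s[i][j] for s in samples) / n_iter for j in range(n_cols)]
            for i in range(n_rows)]


def split_confidence(conf, n_genes, n_metabolites):
    """Split columns into gene-gene, metabolite-gene and perturbation blocks."""
    gene_gene = [row[:n_genes] for row in conf]
    metabolite_gene = [row[n_genes:n_genes + n_metabolites] for row in conf]
    # Empty lists if no perturbation column is present.
    perturbation = [row[n_genes + n_metabolites:] for row in conf]
    return gene_gene, metabolite_gene, perturbation
```

Each entry of the confidence matrix is the fraction of sampled topologies that contain the corresponding link.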

Experimental data preprocessing

Control and PINK1-mutant cell lines were studied through a cell differentiation process from iPSCs to dopaminergic neurons. The PINK1 dataset involved iPSCs carrying the patient-based homozygous mutation ILE368ASN in the PINK1 gene, whereas control cells were obtained from age- and sex-matched individuals. Measurements for both samples occurred on days 0, 8, 18, 25, 32 and 37, which were used to generate the (time-series) scRNA-seq data and bulk metabolomic data. For both control and mutant cases, three biological replicates for metabolomic data were collected.

scRNA-seq processing

The preprocessing steps for scRNA-seq data focused on retaining high-quality cells and genes while removing uninformative data that could affect downstream analyses. Low-quality cells were identified based on three criteria: (1) the number of expressed genes per cell had to exceed 200 and be more than 2 median absolute deviations (MAD) above the median, (2) the total number of counts per cell had to be 2 MAD above or below the median, and (3) the percentage of mitochondrial gene counts had to be less than 1.5 MAD above the median. Cells that failed any of these criteria were classified as low-quality and excluded from further analysis. Genes expressed in fewer than 10 cells were also removed from the dataset. Doublets (i.e., cases where two cells are mistakenly processed as one) were identified using Scrublet⁷⁹, a nearest-neighbour classifier that simulates transcriptomic profiles to predict doublets, and were removed to maintain the integrity of single-cell profiles.

To reduce technical variation and ensure accurate comparisons between cells, gene expression counts were normalised by adjusting feature expression for each cell by the median of the total counts, followed by log transformation. This approach mitigates the effects of differences in sequencing depth while preserving the signal information. After normalisation, the top 500 most dynamically variable genes were identified by calculating the Wasserstein distances of the single-gene expression distributions between consecutive time points. The sum of Wasserstein distances across time points for both control and mutant samples was used to rank genes by their dynamic variability. Genes exhibiting the highest variability were selected for use as input data to MINIE. The PINK1 gene was also added to the list of genes due to its relevance to the study.
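The gene-ranking step can be sketched as below. The one-dimensional Wasserstein distance is computed as the area between the two empirical cumulative distribution functions; the function names and the nested-list data layout are assumptions. The sketch handles a single condition, whereas the procedure described above sums the scores over both control and mutant samples.

```python
import bisect


def wasserstein_1d(u, v):
    """First-order Wasserstein distance between two empirical distributions."""
    u_sorted, v_sorted = sorted(u), sorted(v)
    grid = sorted(u_sorted + v_sorted)
    total = 0.0
    for left, right in zip(grid[:-1], grid[1:]):
        cdf_u = bisect.bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect.bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


def rank_dynamic_genes(expression_by_timepoint):
    """Rank genes by summed Wasserstein distance between consecutive time points.

    expression_by_timepoint[k][g] is the list of normalised expression
    values of gene g across the cells measured at time point k.
    Returns gene indices sorted from most to least dynamic.
    """
    n_genes = len(expression_by_timepoint[0])
    scores = [0.0] * n_genes
    pairs = zip(expression_by_timepoint[:-1], expression_by_timepoint[1:])
    for earlier, later in pairs:
        for g in range(n_genes):
            scores[g] += wasserstein_1d(earlier[g], later[g])
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
```

Selecting the first 500 indices of the returned ranking would correspond to the gene set used as input in this study.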

Metabolomic data processing

The metabolomic data were processed independently by the metabolic platform at the Luxembourg Centre for Systems Biomedicine. The platform identified and quantified metabolites, producing concentration matrices for each condition. After data cleaning, the metabolomics dataset consisted of 111 identified metabolites, with four duplicates (L-tryptophan, 2-hydroxyglutarate, 2-oxoglutarate, and GABA) removed from the dataset. Given the small size of the metabolomic data relative to the transcriptomic data, no further filtering was applied, and all metabolites were input to MINIE.

Data availability

The code implementing MINIE has been deposited in our GitLab repository (https://gitlab.com/uniluxembourg/lcsb/systems-control/minie), along with scripts to reproduce our results. The experimental PINK1 dataset [64] is publicly available at https://doi.org/10.5281/zenodo.11396807. The synthetic and curated benchmark datasets based on the BEELINE study [40] are available at https://zenodo.org/records/3701939, whereas the experimental scRNA-seq datasets were downloaded from Gene Expression Omnibus: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE81252 (hHEP), http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75748 (hESC), http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE98664 (mESC), http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE48968 (mouse dendritic cell), and http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE81682 (mHSC).

References

Baysoy, A., Bai, Z., Satija, R. & Fan, R. The technological landscape and applications of single-cell multi-omics. Nat. Rev. Mol. Cell Biol. 24, 695–713 (2023).
Forrow, A. & Schiebinger, G. LineageOT is a unified framework for lineage tracing and trajectory inference. Nat. Commun. 12, 4940 (2021).
Aalto, A., Viitasaari, L., Ilmonen, P., Mombaerts, L. & Gonçalves, J. Gene regulatory network inference from sparsely sampled noisy data. Nat. Commun. 11, 3493 (2020).
Volkova, S., Matos, M. R., Mattanovich, M. & de Mas, I. M. Metabolic modelling as a framework for metabolomics data integration and analysis. Metabolites 10, 303 (2020).
Yasemi, M. & Jolicoeur, M. Modelling cell metabolism: a review on constraint-based steady-state and kinetic approaches. Processes 9, 322 (2021).
Yugi, K., Kubota, H., Hatano, A. & Kuroda, S. Trans-omics: how to reconstruct biochemical networks across multiple ’omic’ layers. Trends Biotechnol. 34, 276–290 (2016).
Jendoubi, T. Approaches to integrating metabolomics and multi-omics data: a primer. Metabolites 11, 184 (2021).
Mohaiminul Islam, M. et al. An integrative deep learning framework for classifying molecular subtypes of breast cancer. Comput. Struct. Biotechnol. J. 18, 2185–2199 (2020).
Clark, C., Dayon, L., Masoodi, M., Bowman, G. L. & Popp, J. An integrative multi-omics approach reveals new central nervous system pathway alterations in Alzheimer’s disease. Alzheimers Res. Ther. 13, 71 (2021).
Murphy, K. et al. Integrating biomarkers across omic platforms: an approach to improve stratification of patients with indolent and aggressive prostate cancer. Mol. Oncol. 12, 1513–1525 (2018).
Dugourd, A. & Saez-Rodriguez, J. Footprint-based functional analysis of multiomic data. Curr. Opin. Syst. Biol. 15, 82–90 (2019).
Runge, J. Causal network reconstruction from time series: from theoretical assumptions to practical estimation. Chaos 28, 075310 (2018).
Brunton, S. L., Proctor, J. L. & Kutz, J. N. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 113, 3932–3937 (2016).
Mangan, N. M., Brunton, S. L., Proctor, J. L. & Kutz, J. N. Inferring biological networks by sparse identification of nonlinear dynamics. IEEE Trans. Mol. Biol. Multi-Scale Commun. 2, 52–63 (2016).
Marku, M. & Pancaldi, V. From time-series transcriptomics to gene regulatory networks: a review on inference methods. PLoS Comput. Biol. 19, e1011254 (2023).
Matsumoto, H. et al. SCODE: An efficient regulatory network inference algorithm from single-cell RNA-Seq during differentiation. Bioinformatics 33, 2314–2321 (2017).
Sanchez-Castillo, M., Blanco, D., Tienda-Luna, I. M., Carrion, M. C. & Huang, Y. A Bayesian framework for the inference of gene regulatory networks from time and pseudo-time series data. Bioinformatics 34, 964–970 (2018).
Aubin-Frankowski, P.-C. & Vert, J.-P. Gene regulation inference from single-cell RNA-seq data with linear differential equations and velocity inference. Bioinformatics 36, 4774–4780 (2020).
Mao, G., Pang, Z., Zuo, K. & Liu, J. Gene regulatory network inference using convolutional neural networks from scRNA-seq data. J. Comput. Biol. 30, 619–631 (2023).
Huynh-Thu, V. A. & Geurts, P. DynGENIE3: dynamical GENIE3 for the inference of gene networks from time series expression data. Sci. Rep. 8, 3384 (2018).
Aderhold, A., Husmeier, D. & Grzegorczyk, M. Approximate Bayesian inference in semi-mechanistic models. Stat. Comput. 27, 1003–1040 (2017).
Casadiego, J., Nitzan, M., Hallerberg, S. & Timme, M. Model-free inference of direct network interactions from nonlinear collective dynamics. Nat. Commun. 8, 2192 (2017).
Chan, T. E., Stumpf, M. P. & Babtie, A. C. Gene regulatory network inference from single-cell data using multivariate information measures. Cell Syst. 5, 251–267.e3 (2017).
Papili Gao, N., Ud-Dean, S. M., Gandrillon, O. & Gunawan, R. SINCERITIES: inferring gene regulatory networks from time-stamped single cell transcriptional expression profiles. Bioinformatics 34, 258–266 (2018).
Aalto, A., Lamoline, F. & Gonçalves, J. Linear system identifiability from single-cell data. Syst. Control Lett. 165, 105287 (2022).
Lamoline, F., Haasler, I., Karlsson, J., Gonçalves, J. & Aalto, A. Dynamic gene regulatory network inference from single-cell data using optimal transport. Bioinformatics 41, btaf394 (2025).
Shamir, M., Bar-On, Y., Phillips, R. & Milo, R. Snapshot: timescales in cell biology. Cell 164, 1302–1302.e1 (2016).
Kamimoto, K. et al. Dissecting cell identity via network inference and in silico gene perturbation. Nature 614, 742–751 (2023).
Ma, A. et al. Single-cell biological network inference using a heterogeneous graph transformer. Nat. Commun. 14, 964 (2023).
Kartha, V. K. et al. Functional inference of gene regulation using single-cell multi-omics. Cell Genom. 2, 100166 (2022).
Ogris, C., Hu, Y., Arloth, J. & Müller, N. S. Versatile knowledge guided network inference method for prioritizing key regulatory factors in multi-omics data. Sci. Rep. 11, 6806 (2021).
Jagtap, S., Pirayre, A., Bidard, F., Duval, L. & Malliaros, F. D. BRANEnet: embedding multilayer networks for omics data integration. BMC Bioinform. 23, 429 (2022).
Jagtap, S. et al. Multiomics data integration for gene regulatory network inference with exponential family embeddings. In 2021 29th European Signal Processing Conference (EUSIPCO), Dublin, Ireland, 1221–1225 (IEEE, 2021). https://doi.org/10.23919/EUSIPCO54536.2021.9616279
Kleessen, S., Irgang, S., Klie, S., Giavalisco, P. & Nikoloski, Z. Integration of transcriptomics and metabolomics data specifies the metabolic response of Chlamydomonas to rapamycin treatment. Plant J. 81, 822–835 (2015).
Alghamdi, N. et al. A graph neural network model to estimate cell-wise metabolic flux using single-cell RNA-seq data. Genome Res. 31, 1867–1884 (2021).
Deuflhard, P. & Röblitz, S. A Guide to Numerical Modelling in Systems Biology, Vol. 12 (Springer, 2015).
Robinson, J. L. et al. An atlas of human metabolism. Sci. Signal. 13, eaaz1482 (2020).
Montanari, A. N., Lamoline, F., Bereza, R. & Gonçalves, J. Identifiability of differential-algebraic systems. Preprint at https://arxiv.org/abs/2405.13818 (2024).
Desvergne, B., Michalik, L. & Wahli, W. Transcriptional regulation of metabolism. Physiol. Rev. 86, 465–514 (2006).
Pratapa, A., Jalihal, A. P., Law, J. N., Bharadwaj, A. & Murali, T. M. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat. Methods 17, 147–154 (2020).
Schwanhäusser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011).
Link, H., Christodoulou, D. & Sauer, U. Advancing metabolic models with kinetic information. Curr. Opin. Biotechnol. 29, 8–14 (2014).
Moxley, J. F. et al. Linking high-resolution metabolic flux phenotypes and transcriptional regulation in yeast modulated by the global regulator Gcn4p. Proc. Natl. Acad. Sci. USA 106, 6477–6482 (2009).
Yildirim, N. & Mackey, M. C. Feedback regulation in the lactose operon: a mathematical modeling study and comparison with experimental data. Biophys. J. 84, 2841–2851 (2003).
Marbach, D. et al. Revealing strengths and weaknesses of methods for gene network inference. Proc. Natl. Acad. Sci. USA 107, 6286–6291 (2010).
Marbach, D., Schaffter, T., Mattiussi, C. & Floreano, D. Generating realistic in silico gene networks for performance assessment of reverse engineering methods. J. Comput. Biol. 16, 229–239 (2009).
Prill, R. J. et al. Towards a rigorous assessment of systems biology models: the DREAM3 challenges. PLoS ONE 5, e9202 (2010).
Qiu, X. et al. Inferring causal gene regulatory networks from coupled single-cell expression dynamics using Scribe. Cell Syst. 10, 265–274 (2020).
Deshpande, A., Chu, L.-F., Stewart, R. & Gitter, A. Network inference with Granger causality ensembles on single-cell transcriptomics. Cell Rep. 38, 110333 (2022).
Kim, S. ppcor: an R package for a fast calculation to semi-partial correlation coefficients. Commun. Stat. Appl. Methods 22, 665 (2015).
Huynh-Thu, V. A., Irrthum, A., Wehenkel, L. & Geurts, P. Inferring regulatory networks from expression data using tree-based methods. PLoS ONE 5, e12776 (2010).
Specht, A. T. & Li, J. LEAP: constructing gene co-expression networks for single-cell RNA-sequencing data using pseudotime ordering. Bioinformatics 33, 764–766 (2017).
Moerman, T. et al. GRNBoost2 and Arboreto: efficient and scalable inference of gene regulatory networks. Bioinformatics 35, 2159–2161 (2019).
Aubin-Frankowski, P.-C. & Vert, J.-P. Gene regulation inference from single-cell RNA-seq data with linear differential equations and velocity inference. Bioinformatics 36, 4774–4780 (2020).
Woodhouse, S., Piterman, N., Wintersteiger, C. M., Göttgens, B. & Fisher, J. SCNS: a graphical tool for reconstructing executable regulatory networks from single-cell genomic data. BMC Syst. Biol. 12, 1–7 (2018).
Kazachenka, A. et al. Identification, characterization, and heritability of murine metastable epialleles: implications for non-genetic inheritance. Cell 175, 1259–1271 (2018).
Garcia-Alonso, L., Holland, C. H., Ibrahim, M. M., Turei, D. & Saez-Rodriguez, J. Benchmark and integration of resources for the estimation of human transcription factor activities. Genome Res. 29, 1363–1375 (2019).
Szklarczyk, D. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).
Cantini, L. et al. Benchmarking joint multi-omics dimensionality reduction approaches for the study of cancer. Nat. Commun. 12, 124 (2021).
Kamal, A. et al. GRaNIE and GRaNPA: inference and evaluation of enhancer-mediated gene regulatory networks. Mol. Syst. Biol. 19, e11627 (2023).
Mihajlović, K., Malod-Dognin, N., Ameli, C., Skupin, A. & Pržulj, N. MONFIT: multi-omics factorization-based integration of time-series data sheds light on Parkinson’s disease. NAR Mol. Med. 1, ugae012 (2024).
Barabási, A.-L., Gulbahce, N. & Loscalzo, J. Network medicine: a network-based approach to human disease. Nat. Rev. Genet. 12, 56–68 (2011).
Milenković, T. & Pržulj, N. Uncovering biological network function via graphlet degree signatures. Cancer Inform. 6, CIN–S680 (2008).
Mihajlović, K. et al. Multi-omics integration of scRNA-seq time series data predicts new intervention points for Parkinson’s disease. Sci. Rep. 14, 10983 (2024).
Margolin, A. A. et al. Reverse engineering cellular networks. Nat. Protoc. 1, 662–671 (2006).
Yuan, Q. & Duren, Z. Inferring gene regulatory networks from single-cell multiome data using atlas-scale external data. Nat. Biotechnol. 43, 247–257 (2024).
Nguyen, H., Tran, D., Tran, B., Pehlivan, B. & Nguyen, T. A comprehensive survey of regulatory network inference methods using single cell RNA sequencing data. Brief. Bioinform. 22, bbaa190 (2021).
Shen, J. L., Fortier, T. M., Wang, R. & Baehrecke, E. H. Vps13D functions in a Pink1-dependent and parkin-independent mitophagy pathway. J. Cell Biol. 220, e202104073 (2021).
Eldeeb, M. A. et al. Tom20 gates PINK1 activity and mediates its tethering of the TOM and TIM23 translocases upon mitochondrial stress. Proc. Natl. Acad. Sci. USA 121, eadn7191 (2024).
Jiang, H. et al. Proteomic study of a Parkinson’s disease model of undifferentiated SH-SY5Y cells induced by a proteasome inhibitor. Int. J. Med. Sci. 16, 84–92 (2019).
Bouron, A. & Fauvarque, M. O. Genome-wide analysis of genes encoding core components of the ubiquitin system during cerebral cortex development. Mol. Brain 15, 1–24 (2022).
Xu, J. et al. The protective effects of lncRNA ZFAS1/miR-421-3p/MEF2C axis on cerebral ischemia-reperfusion injury. Cell Cycle 21, 1915–1931 (2022).
Mischley, L. K. et al. Central nervous system uptake of intranasal glutathione in Parkinson’s disease. npj Parkinson’s Dis. 2, 16002 (2016).
Takahashi, S. & Mashima, K. Neuroprotection and disease modification by astrocytes and microglia in Parkinson disease. Antioxidants 11, 1–18 (2022).
Kelly, R. T. Single-cell proteomics: progress and prospects. Mol. Cell. Proteom. 19, 1739–1748 (2020).
Trapnell, C. Defining cell types and states with single-cell genomics. Genome Res. 25, 1491–1498 (2015).
Beskos, A., Roberts, G., Stuart, A. & Voss, J. MCMC methods for diffusion bridges. Stoch. Dyn. 8, 319–350 (2008).
Cotter, S., Roberts, G., Stuart, A. & White, D. MCMC methods for functions: modifying old algorithms to make them faster. Stat. Sci. 28, 424–446 (2013).
Wolock, S. L., Lopez, R. & Klein, A. M. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 8, 281–291.e9 (2019).

Acknowledgements

This work is supported by the Luxembourg National Research Fund (FNR), grants No. 19/14063202/ACTIVE (M.M.G.) and CORE19/13684479/DynCell (A.A.).

Author information

Authors and Affiliations

Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
María Moscardó García, Atte Aalto, Arthur N. Montanari, Alexander Skupin & Jorge Gonçalves
Center for Network Dynamics, Northwestern University, Evanston, IL, USA
Arthur N. Montanari
Department of Physics and Astronomy, Northwestern University, Evanston, IL, USA
Arthur N. Montanari
Department of Physics and Material Sciences, University of Luxembourg, Belvaux, Luxembourg
Alexander Skupin
Department of Neurosciences, University of California, San Diego, USA
Alexander Skupin
Department of Plant Sciences, Cambridge University, Cambridge, UK
Jorge Gonçalves

Contributions

A.A., M.M.G., A.N.M. and J.G. designed the research; M.M.G. and A.A. implemented the method; M.M.G. performed the numerical simulations; M.M.G. analysed the data; A.S. led the collection of the experimental data and provided interpretation of the biological relevance; M.M.G. and A.N.M. led the writing of the manuscript; all authors contributed to the interpretation of the results and editing of the final version of the paper.

Corresponding author

Correspondence to Jorge Gonçalves.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Moscardó García, M., Aalto, A., Montanari, A.N. et al. Multi-omic network inference from time-series data. npj Syst Biol Appl 11, 114 (2025). https://doi.org/10.1038/s41540-025-00591-1

Received: 16 February 2025
Accepted: 04 September 2025
Published: 14 October 2025
DOI: https://doi.org/10.1038/s41540-025-00591-1
