Table 3 Key methodological instantiations for the metamodel

From: Computational whole-body-exposome models for global precision brain health

A. Data heterogeneity

| Technique (example) | Required data | Complexity | Interpretability | Computational cost | Multimodal integration | Robustness |
| --- | --- | --- | --- | --- | --- | --- |
| Large-scale deep learning (accelerated brain age in large cohorts) [10] | Large annotated and balanced datasets required | High complexity | Limited and indirect | High computational demands for model training | Integration possible, but not directly built into the model | High, when trained with sufficient data and/or data augmentation |
| Normative modeling (population reference distributions) [98] | Requires large samples with good reference data coverage across diversity axes | Variable, depends on the specific model | Supports subject-level interpretation | Computationally expensive for high-dimensional data | Variable, depends on the specific model | Inherently models uncertainty; robust to heterogeneities |
| Federated learning (distributed training to preserve privacy) [282] | Distributed data requirements | Variable, depends on the specific model | Variable, depends on the specific model | Convergence issues with non-independent and identically distributed data, limited by the least resourced sites | Variable, depends on the specific model | Enhances representation of diverse populations, mitigates site-specific biases, evolves as new data become available |
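
As a rough illustration of the federated learning entry above, the following sketch shows one common scheme, federated averaging, on a toy linear model. It is a minimal sketch under stated assumptions, not the method of the cited work: the two sites, their data, the learning rate, and the function names (`local_update`, `federated_average`) are all hypothetical. Each site trains on its own data and only the fitted parameters are shared and averaged, weighted by sample size.

```python
from typing import List, Tuple

# Hypothetical type: (x, y) pairs that never leave the site holding them.
Site = List[Tuple[float, float]]


def local_update(w: float, b: float, data: Site,
                 lr: float = 0.05, epochs: int = 5) -> Tuple[float, float]:
    """Run a few gradient-descent epochs for y ~ w*x + b on one site's data."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(sites: List[Site], rounds: int = 300) -> Tuple[float, float]:
    """Alternate local training with sample-size-weighted parameter averaging."""
    w, b = 0.0, 0.0
    total = sum(len(s) for s in sites)
    for _ in range(rounds):
        # Each site starts from the current global parameters.
        updates = [local_update(w, b, s) for s in sites]
        # Only parameters are pooled; raw data stay local.
        w = sum(len(s) * uw for s, (uw, _) in zip(sites, updates)) / total
        b = sum(len(s) * ub for s, (_, ub) in zip(sites, updates)) / total
    return w, b


# Toy data (assumption): both sites follow y = 2x + 1 but cover
# different x ranges, so their input distributions are not identical.
site_a = [(x / 10, 2 * x / 10 + 1) for x in range(0, 10)]
site_b = [(x / 10, 2 * x / 10 + 1) for x in range(10, 20)]
w, b = federated_average([site_a, site_b])
print(f"global slope={w:.3f}, intercept={b:.3f}")
```

In this toy case the sites share the same underlying relationship, so averaging recovers it. When sites differ in the relationship itself, the local updates pull in different directions, which is one way the convergence issues noted in the table can arise.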

B. Multimodal integration (brain + extracerebral metrics)

| Technique (example) | Required data | Complexity | Interpretability | Computational cost | Multimodal integration | Robustness |
| --- | --- | --- | --- | --- | --- | --- |
| Whole-brain biophysical modeling (perturbations and extracerebral modulation) [283] | Low, allows single-subject modeling | Variable, but limited by high computational cost | Both mechanistic and causal interpretability | High computational demands for model training | Can integrate systemic and environmental modulators via priors or order parameters | Sensitive to assumptions about coupling between neural regions |
| Dynamic causal modeling (DCM, neuropharmacological dynamics) [284] | Low, allows single-subject modeling | Variable, but limited by high computational cost | Causal interpretability of directed brain connectivity, rigorous Bayesian model comparison, physiologically meaningful parameters | High computational demands for parameter optimization | Supports multimodal extensions | Requires strong a priori model specification, sensitive to model specification details |
| Active inference (generative brain-body models) [285] | Low, allows single-subject modeling | Variable, but limited by high computational cost | Limited empirical application; abstract constructs require biological mapping | High computational demands for parameter optimization | Unifies perception, action, and regulation | Explicit uncertainty modeling but difficult model specification |
| Deep multimodal learning (joint representation models) [286] | Requires large multimodal datasets | High complexity, can approximate arbitrary non-linear decision functions | Limited and indirect | High computational demands for model training | Captures non-linear cross-modal interactions | Prone to modality dominance, limited native uncertainty estimation |
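
To make the note on sensitivity to inter-regional coupling more concrete, the sketch below simulates a small network of Kuramoto phase oscillators. This is a deliberately simplified stand-in chosen for illustration; the source does not state which dynamical model the whole-brain approaches use. The natural frequencies, initial phases, coupling values, and the function name `simulate_kuramoto` are all assumptions. The synchrony measure `r` ranges from 0 (incoherent) to 1 (fully phase-locked).

```python
import cmath
import math
from typing import Sequence


def simulate_kuramoto(omega: Sequence[float], theta0: Sequence[float],
                      coupling: float, dt: float = 0.01,
                      steps: int = 5000) -> float:
    """Euler-integrate globally coupled phase oscillators.

    Returns the synchrony measure r averaged over the second half of the run.
    """
    n = len(omega)
    theta = list(theta0)
    r_sum, r_count = 0.0, 0
    for step in range(steps):
        # d(theta_i)/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)
        dtheta = [
            omega[i] + coupling / n
            * sum(math.sin(theta[j] - theta[i]) for j in range(n))
            for i in range(n)
        ]
        theta = [theta[i] + dt * dtheta[i] for i in range(n)]
        if step >= steps // 2:
            r_sum += abs(sum(cmath.exp(1j * t) for t in theta)) / n
            r_count += 1
    return r_sum / r_count


# Hypothetical five-region network with slightly different natural frequencies.
omega = [0.8, 0.9, 1.0, 1.1, 1.2]
theta0 = [0.0, 1.0, 2.0, 3.0, 4.0]

r_uncoupled = simulate_kuramoto(omega, theta0, coupling=0.0)
r_coupled = simulate_kuramoto(omega, theta0, coupling=2.0)
print(f"synchrony without coupling: {r_uncoupled:.2f}")
print(f"synchrony with coupling:    {r_coupled:.2f}")
```

The only difference between the two runs is the assumed coupling strength, yet the resulting dynamics differ qualitatively. In a fitted whole-brain model, a similar dependence means that conclusions can hinge on how coupling between regions is specified.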

C. Individual-level trajectories

| Technique (example) | Required data | Complexity | Interpretability | Computational cost | Multimodal integration | Robustness |
| --- | --- | --- | --- | --- | --- | --- |
| Bayesian sequential inference (state-space models) [287] | High for high-dimensional or poorly sampled systems | High model complexity | Interpretability in terms of personalized trajectories | High computational demands for high-dimensional systems | Integration possible, but not directly built into the model | Adaptive updating with new observations, explicit uncertainty quantification, handles irregular sampling and noise |
| Markov chain models (e.g., Markov and hidden Markov models for spatiotemporal brain dynamics) [288] | Large amounts of data needed for rare transitions | Low complexity, memoryless assumption can oversimplify dynamics, discretization may lose information | Simple interpretability, analytically tractable, clear clinical stage modeling | Low computational demands | Integration possible, but not directly built into the model | Sensitive to model initialization |
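
The Markov chain entry above can be illustrated with a minimal sketch that estimates a transition matrix from discretized stage sequences by counting. The stage labels, the three toy sequences, and the function name `estimate_transitions` are hypothetical and chosen only for illustration. No smoothing is applied, so a transition that never appears in the data receives probability zero.

```python
from collections import Counter
from typing import Dict, List, Sequence


def estimate_transitions(sequences: Sequence[Sequence[str]],
                         states: List[str]) -> Dict[str, Dict[str, float]]:
    """Maximum-likelihood transition probabilities from observed sequences."""
    counts: Counter = Counter()
    for seq in sequences:
        # Count each consecutive (current, next) pair; memoryless assumption.
        for current, nxt in zip(seq, seq[1:]):
            counts[(current, nxt)] += 1
    matrix: Dict[str, Dict[str, float]] = {}
    for s in states:
        row_total = sum(counts[(s, t)] for t in states)
        matrix[s] = {
            t: (counts[(s, t)] / row_total if row_total else 0.0)
            for t in states
        }
    return matrix


# Hypothetical clinical stages and follow-up sequences for three participants.
states = ["healthy", "mild", "severe"]
sequences = [
    ["healthy", "healthy", "mild", "mild", "severe"],
    ["healthy", "healthy", "healthy", "mild"],
    ["mild", "severe", "severe"],
]

P = estimate_transitions(sequences, states)
for s in states:
    print(s, {t: round(p, 2) for t, p in P[s].items()})
```

With so few observations, the direct healthy-to-severe transition is never seen and is estimated as impossible, which is one way to read the table's remark that large amounts of data are needed for rare transitions.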

Description of the dimensions used in the table:
  1. Required data: The amount and organization of data required to train the models. Ranges from individual datasets of relatively low complexity to large, structured, annotated databases spanning thousands of participants.
  2. Complexity: The capacity of the model to detect complex and non-linear patterns in the data. Complex models can generalize more powerfully, but they are computationally demanding and their parameter optimization is data intensive. They also tend to overfit when trained with insufficient data or without adequate regularization.
  3. Interpretability: The extent of understanding gained by analyzing the models. Interpretable models yield insights into the mechanisms driving model predictions and can also inform neurobiological and causal relationships in the data.
  4. Computational cost: The combined processing power and data storage required by each technique.
  5. Multimodal integration: The capacity of the model to process inputs that combine dissimilar data sources, for instance, different neuroimaging modalities. Some models integrate data sources only trivially, by concatenating them in the inputs, whereas others offer more parsimonious and streamlined integration.
  6. Robustness: The capacity of the model to yield meaningful predictions when facing noisy, uncertain, and/or heterogeneous data, including missing and/or potentially misleading samples. Robust models are also stable against methodological choices, such as the specification of hyperparameters and prior assumptions, and can be easily adapted to handle new data as they become available.
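
The overfitting risk described under the complexity dimension can be shown with a toy comparison. This is a minimal sketch on made-up data, not an analysis from the source: the training points, the alternating noise pattern, and the helper names are assumptions. A straight-line fit is compared with a degree-5 polynomial that passes exactly through all six noisy training points.

```python
from typing import Callable, Sequence

Model = Callable[[float], float]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Ordinary least-squares straight line (low-complexity model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_interpolant(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Lagrange polynomial through every training point (high-complexity model)."""
    def predict(x: float) -> float:
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mse(model: Model, xs: Sequence[float], ys: Sequence[float]) -> float:
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (assumption): the true relationship is y = x, observed with
# alternating +0.3 / -0.3 noise at only six training points.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [x + (0.3 if i % 2 == 0 else -0.3) for i, x in enumerate(train_x)]
test_x = [0.5, 1.5, 2.5, 3.5, 4.5]
test_y = list(test_x)  # noise-free targets between the training points

simple = fit_line(train_x, train_y)
flexible = fit_interpolant(train_x, train_y)

simple_train, simple_test = mse(simple, train_x, train_y), mse(simple, test_x, test_y)
flexible_train, flexible_test = mse(flexible, train_x, train_y), mse(flexible, test_x, test_y)
print(f"line:       train MSE={simple_train:.4f}, test MSE={simple_test:.4f}")
print(f"polynomial: train MSE={flexible_train:.4f}, test MSE={flexible_test:.4f}")
```

The flexible model reproduces the training data perfectly, noise included, and does worse than the line on the held-out points. More training data or regularization would be the usual remedies, in line with the description above.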