Table 3 Overview of the research questions, corresponding methodological approaches, selected models, and performance metrics used in this benchmark

From: A systematic benchmark of integrative strategies for microbiome-metabolome data

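| Scientific question | Research aim | Model selection | Selected models | Performance metrics |
| --- | --- | --- | --- | --- |
| Is there any relationship between microorganisms and metabolites at a global level? | Global associations | Need to adjust for covariates? YES: Regression-based models | MMiRKAT | Type-I error rate, Power |
| Is there any relationship between microorganisms and metabolites at a global level? | Global associations | Need to adjust for covariates? NO: Bi-directional models | Mantel test, Procrustes analysis | Type-I error rate, Power |
| Are microbiome and metabolome datasets summarizable through a limited number of components? | Data summarization | Does the directionality matter? YES: Regression-based models | PLS-Regression, RDA | % Explained Variance |
| Are microbiome and metabolome datasets summarizable through a limited number of components? | Data summarization | Does the directionality matter? NO: Canonical-based models | PLS-Canonical, CCA, MOFA2 | % Explained Variance |
| Can we identify associations between metabolites and species? | Individual associations | Need to adjust for covariates? YES: Regression-based models | Log-Contrast, MiRKAT, CLR-LM | Type-I error rate, Power |
| Can we identify associations between metabolites and species? | Individual associations | Need to adjust for covariates? NO: Correlation-based models | Pearson, Spearman, HALLA | Type-I error rate, Power |
| Can we identify core microorganisms and metabolites? | Feature selection | Need to account for the between-within correlation? YES: Univariate models | CODA-LASSO, CLR-(M)LASSO | Sparsity, sensitivity, specificity |
| Can we identify core microorganisms and metabolites? | Feature selection | Need to account for the between-within correlation? NO: Multivariate models | sPLS-Regression, sPLS-Canonical, sCCA | Sparsity, sensitivity, specificity |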

  1. Methods are categorized based on their ability to assess global associations, summarize data, identify individual associations, or perform feature selection.
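
The performance metrics named in the table can be illustrated with a minimal sketch. It assumes the usual textbook definitions (rejection rate at a significance level for Type-I error and power; true-positive and true-negative rates for sensitivity and specificity), which may differ in detail from the definitions used in the benchmark itself. The function names, the significance level of 0.05, the p-values, and the feature names are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, Set


def rejection_rate(p_values: Iterable[float], alpha: float = 0.05) -> float:
    """Fraction of tests with p < alpha.

    On data simulated under the null this estimates the Type-I error rate;
    on data simulated under the alternative it estimates power.
    """
    p = list(p_values)
    if not p:
        raise ValueError("p_values must not be empty")
    return sum(1 for v in p if v < alpha) / len(p)


def selection_metrics(selected: Set[str], truth: Set[str],
                      all_features: Set[str]) -> dict:
    """Sensitivity and specificity of a selected feature set
    against a known set of truly associated features."""
    tp = len(selected & truth)            # truly associated and selected
    fn = len(truth - selected)            # truly associated but missed
    negatives = all_features - truth      # features with no true association
    fp = len(selected & negatives)        # selected despite no association
    tn = len(negatives - selected)        # correctly left out
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical p-values from four null and four alternative simulations.
null_p = [0.50, 0.03, 0.80, 0.20]
alt_p = [0.01, 0.04, 0.30, 0.002]
type1_error = rejection_rate(null_p)  # one of four null tests rejected
power = rejection_rate(alt_p)         # three of four alternative tests rejected

# Hypothetical feature names: two true signals among five features.
features = {"m1", "m2", "m3", "m4", "m5"}
truth = {"m1", "m2"}
selected = {"m1", "m3"}
metrics = selection_metrics(selected, truth, features)
```

The same `rejection_rate` function serves both the global and the individual association rows of the table, since the two metrics differ only in whether the data were simulated with or without a true association. Sparsity is left out of the sketch because its exact definition is not stated in the table.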