Fig. 5: Using network modularity as relevance prior in the informed propagation algorithm for gene prioritization.
From: Network analysis reveals rare disease signatures across multiple levels of biological organization

a Schematic overview of the informed multiplex network propagation algorithm, which incorporates modularity as a measure of the relevance of a particular network level for a given disease group. b Comparison of 10-fold cross-validation performance in rare disease gene retrieval for different choices of included networks: the informed algorithm with the most relevant networks (blue), all networks (green), the PPI (red), and the single most relevant layer for each disease (yellow). Dashed lines show the median value across all folds; shaded areas represent the interquartile range. The retrieval performance indicates that disease mechanisms are generally better recapitulated by incorporating relevant networks only. c Comparison of the AUROC from all four methods. Utilizing the significant networks leads to more accurate disease gene retrieval compared to all networks, the single most relevant layer, or the PPI (Bonferroni-Holm corrected Durbin-Conover test p-values = 0.026, 1.22e-6, and 3e-16, respectively). Thresholds for p-values: p < 0.05:*, p < 0.01:**, p < 0.001:***, p < 0.0001:****; n = 26 rare disease groups across all network sets. Box bounds represent the 25th and 75th percentiles, the center line the median, and the whiskers the 10th and 90th percentiles. d Factors correlated with the retrieval performance. The algorithm that incorporates all networks can outperform the informed algorithm for diseases with high levels of syndromicity, i.e., diseases that manifest in multiple physiological systems (left, Spearman’s ρ = −0.53, p-value = 0.004). Decreasing functional relevance as the number of genes increases also led to lower predictive performance (right, Spearman’s ρ = −0.83, p-value = 1.94e-6). The p-values of the correlations were determined by Fisher z-transformation, two-sided.
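The correlation test named for panel d can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the standard error 1/sqrt(n − 3) is an assumed (textbook) choice that the caption does not specify, so the sketch is not expected to reproduce the reported p-values exactly.

```python
import math
from statistics import NormalDist


def rank(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = rank(x), rank(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def fisher_z_pvalue(rho, n):
    """Two-sided p-value for rho via the Fisher z-transformation.

    Assumes |rho| < 1 and n > 3, and uses the standard error
    1 / sqrt(n - 3) (an assumption; other variants exist for Spearman).
    """
    z = math.atanh(rho) * math.sqrt(n - 3)
    return 2 * NormalDist().cdf(-abs(z))
```

Given two equal-length sequences of per-disease-group values, `fisher_z_pvalue(spearman_rho(x, y), len(x))` returns the two-sided p-value under these assumptions.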