Figure 4

Effects of variable adjustment on univariate or MGM association analysis in the training set. (a) “top assoc.” shows the distribution of −log10(p-values) derived from univariate regression: we calculated p-values between all possible pairs of variables and collected all top associations. “top neigh.” shows the analogous distribution, where the top feature was selected by the largest absolute edge weight in the MGM neighborhood. (b) The corresponding plot, where the p-values were adjusted for the top five confounder variables of the univariate and MGM screening, respectively. (c) The corresponding plot, where we adjusted for the same five randomly selected features for both methods. (d–f) show the differences between “top neigh.” and “top assoc.” in (a) to (c), respectively: (d) shows the −log10(p-values) of the MGM approach minus those of the univariate screening in (a), (e) shows the corresponding plot after adjusting for the respective top confounders, as shown in (b), and (f) shows the corresponding plot after adjusting for the randomly selected confounders, as shown in (c). The red points in each panel plot the differences (y-axis) against their respective rank (x-axis), where the highest positive difference corresponds to 1 and the most negative to 0. The green shaded areas correspond to the rank percentiles of negative differences, and the violet shaded areas to the rank percentiles of positive differences.
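
The quantities plotted in panels (d–f) can be illustrated with a minimal sketch. It assumes that the rank on the x-axis is a linear rescaling of the ascending rank of each difference to the interval [0, 1]. The caption does not state the exact ranking or tie-handling scheme, so this is an assumption, and the function name `rank_percentiles` is hypothetical.

```python
import numpy as np

def rank_percentiles(neg_log10_p_mgm, neg_log10_p_uni):
    """Return the MGM-minus-univariate differences and their ranks scaled to [0, 1].

    Assumption: the most negative difference maps to 0 and the highest
    positive difference maps to 1; ties are broken by sort order.
    """
    diff = (np.asarray(neg_log10_p_mgm, dtype=float)
            - np.asarray(neg_log10_p_uni, dtype=float))
    order = np.argsort(diff)                 # indices from smallest to largest difference
    ranks = np.empty(len(diff), dtype=float)
    ranks[order] = np.arange(len(diff))      # ascending rank 0 .. n-1
    if len(diff) > 1:
        ranks /= len(diff) - 1               # rescale to [0, 1]
    return diff, ranks
```

Under this reading, the share of points with a negative difference gives the boundary between the green and violet shaded areas, for example `np.mean(diff < 0)`.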