Extended Data Fig. 5: Schematic overview of the Bayesian nonnegative matrix factorization clustering algorithm.
From: Genetic analysis of dietary intake identifies new loci and functional links with metabolic traits

The input for the Bayesian nonnegative matrix factorization (bNMF) clustering algorithm was the set of 31 genetic variants reaching nominally significant association with proportion of fat intake. Next, summary association statistics for 22 dietary intake traits from the UK Biobank were aggregated for each of these variants. Variants were aligned so that the effect allele was the one associated with increased fat intake. Using the UK Biobank GWAS summary statistics, we generated standardized effect sizes for each variant-trait association by dividing the estimated regression coefficient (beta) by its standard error, yielding a 31 × 22 variant-trait association matrix. The defining features of each cluster were determined by its most highly associated traits, which are a natural output of the bNMF approach. The bNMF algorithm was run in R for 1,000 iterations, each with different initial conditions, and the maximum posterior solution at the most probable number of clusters was selected for downstream analysis.
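
The alignment and standardization steps described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names, variant identifiers, trait names and numbers are all hypothetical and chosen only for illustration. The final helper, which splits each row into positive and negative parts so that the matrix is nonnegative, is an assumption about how such input might be prepared for a nonnegative factorization and is not stated in this legend.

```python
from typing import Dict, List, Tuple

# (beta, standard error) for one variant-trait association.
SumStat = Tuple[float, float]


def align_and_standardize(
    stats: Dict[str, Dict[str, SumStat]],
    fat_beta: Dict[str, float],
    traits: List[str],
) -> Dict[str, List[float]]:
    """Return one row of standardized effects (beta / se) per variant.

    Signs are flipped where needed so that every row refers to the allele
    associated with increased fat intake.
    """
    matrix: Dict[str, List[float]] = {}
    for variant, per_trait in stats.items():
        # Flip the sign when the reported effect allele lowers fat intake.
        sign = -1.0 if fat_beta[variant] < 0 else 1.0
        matrix[variant] = [
            sign * per_trait[t][0] / per_trait[t][1] for t in traits
        ]
    return matrix


def split_nonnegative(row: List[float]) -> List[float]:
    """Assumption, not from the legend: represent a signed row as
    [positive parts..., negative parts...] so that all entries are >= 0."""
    positive = [max(z, 0.0) for z in row]
    negative = [max(-z, 0.0) for z in row]
    return positive + negative


# Toy data (made up): two variants, two traits.
traits = ["fat", "carbohydrate"]
stats = {
    "variant_A": {"fat": (0.5, 0.125), "carbohydrate": (-0.25, 0.125)},
    "variant_B": {"fat": (-0.75, 0.25), "carbohydrate": (0.5, 0.25)},
}
fat_beta = {"variant_A": 0.5, "variant_B": -0.75}

z_matrix = align_and_standardize(stats, fat_beta, traits)
nonneg_matrix = {v: split_nonnegative(row) for v, row in z_matrix.items()}
```

In the analysis described here, the corresponding matrix would have 31 rows (variants) and 22 columns (dietary intake traits) before any such splitting.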