Fig. 1: GWASs of HMOs in 980 mothers of the CHILD cohort study.

a Overlaid Manhattan plots from GWASs of 19 HMOs (linear regression analyses in PLINK): SNPs in four chromosomal regions were significantly associated with HMOs, with the strongest association at rs492602 on chromosome 19q13.33 for LNFPI (P = 2.38e–118). Additional associations were detected on chromosomes 19p13.3, 3q27.3 and 10q22.1 (Supplementary Data 1). The x-axis indicates the chromosomal position, and the y-axis indicates the significance of SNP associations (-log10(P)). The red line represents the genome-wide significance threshold (P < 5e–8). Orange highlights SNPs within each of the significant regions.
b Overlaid regional plots of SNP associations on chromosome 19: significant associations were detected for 18 of the 19 individual HMOs (all except 6’SL). The red horizontal line represents the genome-wide significance threshold (P < 5e–8).
c Overlaid LocusZoom plots of chromosome 19q13.33: 324 SNPs were associated with individual HMOs. The most significant SNP (rs492602, P = 2.38e–118) is indicated by a dark purple dot and is in LD (r² = 0.99) with the known stop-gain variant rs601338 in the FUT2 gene. The x-axis shows genes mapped to this associated genomic region (250 kb), and the y-axis indicates the significance of SNP associations (-log10(P)).
d Metabolic flux from LNT and LNnT to fucosylated or sialylated pentasaccharides and corresponding HMO concentrations associated with rs601338 in the FUT2 gene. The illustrated metabolic pathway shows that this SNP is associated with almost all HMOs, not just those that are alpha1-2-fucosylated (e.g. 2’FL or LNFPI). The green oval highlights the alpha1-2-linked fucose. Boxplots show selected HMO concentrations associated with rs601338, supporting the illustrated synthesis pathways (N = 980). Box minimum: Q1; box maximum: Q3; box center: median; whiskers: farthest points that are not outliers (i.e., within 3/2 times the interquartile range).
e Principal Coordinate Analysis (PCoA) plot of overall HMO profiles by SNP rs601338 genotypes using a Bray-Curtis distance matrix. Subjects by genotype: GG, N = 386; GA, N = 405; AA, N = 190. Each dot represents the entire HMO profile of an individual mother. Variation along the primary axis, which accounted for 56.6% of the variation in overall HMO concentrations, was strongly associated with the stop-gain variant rs601338 in the FUT2 gene.
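The ordination in panel e can be illustrated with a minimal sketch of classical PCoA on a Bray-Curtis distance matrix. This is not the authors' code: the caption does not state which software was used for this panel, the function names `bray_curtis` and `pcoa` are hypothetical, and the input is synthetic random data rather than CHILD cohort measurements. The proportion per axis is computed over positive eigenvalues only, which is one common convention because Bray-Curtis distances can yield negative eigenvalues.

```python
import numpy as np

def bray_curtis(X):
    """Pairwise Bray-Curtis dissimilarity between rows of a non-negative matrix."""
    X = np.asarray(X, dtype=float)
    num = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    den = (X[:, None, :] + X[None, :, :]).sum(axis=2)
    # Define the dissimilarity as 0 where both profiles are all zeros.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

def pcoa(D):
    """Classical PCoA: coordinates and proportion of variation per axis."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # Gower double-centering
    eigvals, eigvecs = np.linalg.eigh(B)      # B is symmetric
    order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1e-12                    # drop zero/negative axes
    coords = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep] / eigvals[keep].sum()
    return coords, explained

# Synthetic stand-in: 30 "mothers" x 19 "HMO concentrations" (assumed data).
rng = np.random.default_rng(0)
profiles = rng.gamma(shape=2.0, scale=100.0, size=(30, 19))

D = bray_curtis(profiles)
coords, explained = pcoa(D)
print(f"Axis 1 explains {explained[0]:.1%} of the variation")
```

Each row of `coords` corresponds to one dot in a plot like panel e, and `explained[0]` plays the role of the percentage reported for the primary axis; coloring the points by genotype would then show how strongly the axis separates the groups.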