Fig. 2: Protein glycosylation composition and microheterogeneity in the mouse brain.

a, Frequency of the number of unique glycan compositions per site in the PGC-fractionated mouse brain N-glycosylation data (on average, 18 glycan compositions were identified per site). b, Total number of unique glycopeptides, glycosites and glycoproteins per glycan class for the PGC-fractionated mouse brain sample. c, Frequencies of glycan classes identified in this study and in a recent glycoproteomics study using lectin-based enrichment (ref. 16). d, Distinct glycan classes are significantly overrepresented or underrepresented on specific functional protein domains from the InterPro database (log2 odds ratio > 0 or log2 odds ratio < 0, respectively; Fisher’s exact test, adjusted P value < 1 × 10^−3; examples shown). e, Glycosylation profiles of sites belonging to the same protein are significantly more correlated than when glycan compositions are shuffled across the whole glycoproteome (Kendall rank correlation coefficient for n = 29,932 site pairs per distribution, two-sided Wilcoxon rank-sum test, P < 2.2 × 10^−16). Box plots indicate the median and the first and third quartiles. Whiskers extend from the hinges to the largest value no further than 1.5 × the interquartile range. Data points beyond the end of the whiskers are plotted individually. f, Results of the GO enrichment analysis (STRINGdb) of proteins with sites displaying low, medium or high microheterogeneity (n = 1–2, 3–11 and 11+ unique glycan compositions per site, respectively) for selected significant terms (two-sided Fisher’s exact test, FDR < 0.05). g, Frequency of glycosites or glycopeptides with a close known phosphosite (±5 residues) per glycan class for O-glycosylation and N-glycosylation data.
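
The enrichment statistic in panel d can be illustrated with a minimal sketch. It assumes a 2 × 2 contingency table of glycosite counts (sites inside or outside a given domain, carrying or not carrying a given glycan class). The function names and the example counts are hypothetical and are not taken from the study, and the multiple-testing adjustment mentioned in the caption is not shown.

```python
from math import comb, log2

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(p, 1.0)

def log2_odds_ratio(a, b, c, d):
    """log2 of the sample odds ratio (a*d)/(b*c); undefined if b or c is zero."""
    return log2((a * d) / (b * c))

# Hypothetical counts: rows = in domain / not in domain,
# columns = has glycan class / lacks glycan class
a, b, c, d = 30, 10, 20, 40
lor = log2_odds_ratio(a, b, c, d)   # > 0 suggests overrepresentation
p = fisher_exact_two_sided(a, b, c, d)
print(f"log2 OR = {lor:.3f}, P = {p:.3g}")
```

A positive log2 odds ratio corresponds to overrepresentation of the glycan class on the domain and a negative value to underrepresentation; in practice the raw P values would still need adjustment across all tested domain and class pairs.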