Fig. 1: Glycomics data are compositional. | Nature Communications

From: Compositional data analysis enables statistical rigor in comparative glycomics

a Compared with absolute abundances (top), relative abundances (middle) distort relationships between glycans, which can be rescued via compositional data analysis (bottom). Transformed abundances indicate CLR-transformed data with an informed-scale model. b Ion intensities (absolute values), relative abundances (%), and ALR-transformed abundances of glycan standards (Supplementary Data 1). Control standards (gray) were added at 2 pmol/µL in all samples, whereas standards 3/4 were added in increasing concentrations (colored). Data are shown as bar graphs with standard deviation (centered at sample mean), with data points (n = 3 independent mixtures) overlaid as a scatterplot. Significance was established via two-tailed Welch’s t-tests. All comparisons without a star did not reach the significance threshold (p < 0.05). c The false-positive rate (FPR) for identifying spurious differences between conditions increases with sample size when the compositional nature of glycomics data is ignored. Glycomics data were simulated via Dirichlet distributions with defined effects (see “Methods”). For each sample size, 10 independent simulations were performed (n = 10) and results are shown as means with standard deviation as error bars. The Benjamini–Hochberg procedure was used for FDR control, except for ALR (the workflow presented here, which used the two-stage Benjamini–Hochberg procedure). d Compositional data analysis maintains excellent sensitivity for identifying differences between conditions. Simulations as in (c), but tracking sensitivity instead of type I error. e Overview of the CoDA-improved workflow for differential glycomics abundance analysis. While the schema most closely resembles the differential expression workflow (get_differential_expression), the indicated steps were largely preserved in all analyses presented here and in glycowork.
ALR was considered to have failed if the Procrustes correlation of the best reference glycan was below 0.9 or if its log2-transformed variance was above 0.1. f Implementing CLR and ALR transformations for glycomics data. ALR depicts the successful choice of a reference glycan that fulfills the desired criteria. Scale uncertainty can further be introduced into ALR by subtracting log2(N(0, γ)) from the reference. Both depictions describe usage with an uninformed scale model (i.e., scale uncertainty). *p < 0.05, **p < 0.01, ***p < 0.001. Source data are provided as a Source Data file.
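
The transformations described in panel f can be illustrated with a minimal NumPy sketch. This is not the glycowork implementation: the function names (`clr`, `alr`, `reference_variance_ok`) and the toy data are hypothetical, and three readings of the caption are assumptions: that γ is the standard deviation of Gaussian noise drawn on the log2 scale, that one noise value is drawn per sample, and that the "log2-transformed variance" is the sample variance of the reference's log2 abundances. The Procrustes-correlation criterion is omitted.

```python
import numpy as np

def clr(abundances, gamma=0.0, rng=None):
    """Centered log-ratio: log2 abundance minus the per-sample mean log2 abundance.

    abundances: strictly positive array of shape (glycans, samples).
    gamma: assumed std of the scale-uncertainty noise (0 disables it).
    """
    rng = np.random.default_rng() if rng is None else rng
    log_x = np.log2(np.asarray(abundances, dtype=float))
    center = log_x.mean(axis=0, keepdims=True)  # log2 of the geometric mean
    noise = rng.normal(0.0, gamma, size=center.shape)  # one draw per sample
    return log_x - (center - noise)

def alr(abundances, ref_index, gamma=0.0, rng=None):
    """Additive log-ratio: log2 abundance minus the log2 reference glycan.

    Noise is subtracted from the reference, following the caption's wording.
    The reference row is dropped from the output.
    """
    rng = np.random.default_rng() if rng is None else rng
    log_x = np.log2(np.asarray(abundances, dtype=float))
    ref = log_x[ref_index]
    noise = rng.normal(0.0, gamma, size=ref.shape)  # one draw per sample
    return np.delete(log_x - (ref - noise), ref_index, axis=0)

def reference_variance_ok(abundances, ref_index, max_var=0.1):
    """Check only the variance criterion for a candidate reference (assumed reading)."""
    log_ref = np.log2(np.asarray(abundances, dtype=float)[ref_index])
    return bool(np.var(log_ref, ddof=1) <= max_var)

# Toy relative abundances (4 glycans x 3 samples); each column sums to 1.
rel = np.array([
    [0.40, 0.35, 0.30],
    [0.30, 0.30, 0.30],
    [0.20, 0.25, 0.30],
    [0.10, 0.10, 0.10],  # stable glycan used as the ALR reference
])

clr_vals = clr(rel)                # no scale uncertainty
alr_vals = alr(rel, ref_index=3)   # no scale uncertainty
alr_noisy = alr(rel, ref_index=3, gamma=0.5, rng=np.random.default_rng(0))
```

With `gamma=0` the CLR values of each sample sum to zero and the ALR values reduce to plain log2 ratios against the reference. With `gamma > 0`, every glycan in a sample is shifted by the same random offset, which is how this sketch models uncertainty about the overall scale.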