Extended Data Fig. 10: Validation of the MAG assembly pipeline.
From: Bioactive glycans in a microbiome-directed food for children with malnutrition

(a) Bioinformatic workflow for comparing the fidelity of MAG assembly from alignment-based (bowtie2) versus pseudoalignment-based (kallisto) contig quantitation. (b) Detailed description of the workflow outlined in panel a. Each box includes a summary of the computational task plus the name of the program and, where relevant, the command used to complete the task (in parentheses). Colour of the text in parentheses: brown, default code from kallisto; purple, default code from bowtie2; blue, custom script written to achieve the tasks described; black, default code used for programs to assemble MAGs and dereplicate contigs. Boxes with a black outline, and the thick black arrows emanating from them, indicate that binned contigs were used as input to AMBER for MAG assembly comparisons across methods. (c) Boxplots showing summary statistics (completeness and purity) for each MAG assembly approach. Boxplots indicate the median and the first and third quartiles; whiskers extend to the largest value no further than 1.5 × the interquartile range. The number of individual MAGs generated using each combination of MAG assembly tool and quantitation method was 671 for DAS tool-kallisto, 723 for MaxBin2-kallisto, 925 for MetaBAT2-kallisto, 362 for CONCOCT-kallisto, 540 for DAS tool-bowtie2, 780 for MaxBin2-bowtie2, 578 for MetaBAT2-bowtie2 and 340 for CONCOCT-bowtie2. (d) Distribution of the MAGs obtained from each assembly and quantitation strategy across two quality metrics (completeness and contamination). The number of ‘gold standard’ MAGs (theoretical maximum) in the simulated metagenomic dataset is indicated by a horizontal dashed line. *, P < 0.05; **, P < 0.01; ***, P < 0.001 (two-tailed, unpaired Wilcoxon test for panel c; two-tailed Fisher’s exact test for panel d).
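
The boxplot convention in panel c (median, first and third quartiles, whiskers limited to 1.5 × the interquartile range) can be illustrated with a minimal Python sketch. This is not the authors' plotting code: the function names `quantile` and `boxplot_stats`, the linear-interpolation quantile rule, the symmetric treatment of the lower whisker and the example completeness values are all assumptions made for illustration.

```python
from statistics import median


def quantile(sorted_vals, q):
    """Quantile by linear interpolation (an assumed convention; others exist)."""
    pos = (len(sorted_vals) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def boxplot_stats(values):
    """Return the median, quartiles, whisker ends and outliers for one group."""
    vals = sorted(values)
    q1 = quantile(vals, 0.25)
    q3 = quantile(vals, 0.75)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations that lie within the fences.
    lower_whisker = min(v for v in vals if v >= lower_fence)
    upper_whisker = max(v for v in vals if v <= upper_fence)
    outliers = [v for v in vals if v < lower_fence or v > upper_fence]
    return {
        "median": median(vals),
        "q1": q1,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


# Hypothetical completeness values (%) for a handful of MAGs, not data from the study.
example_completeness = [5, 50, 60, 70, 80, 90, 100]
stats = boxplot_stats(example_completeness)
print(stats)
```

In this made-up example the value 5 falls below the lower fence, so it would be drawn as an individual point rather than included in the whisker.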