Fig. 2: Determining the impact of CHM13v2 host genome mapping for optimizing microbial read quantification in control samples.

a Percentage of reads remaining after mapping to the human reference genome (i.e., % unmapped reads) using the cfSPI workflow. Reads were mapped to the human reference genome GRCh38.p14, to CHM13v2, or to both (dual-mapping). b Fractional abundance of reads classified by kraken2 as fungi (kingdom) (CT = 0, kraken2’s default) after subtracting host reads via reference genome mapping, normalized to the older human genome assembly (GRCh38.p14). Fractional abundance of reads classified as c human (species) and d fungi (kingdom) when using the ‘CHM13v2-containing uR.7’ or ‘uR.7’ database for kraken2 taxonomic classification (CT = 0, kraken2’s default) after dual-mapping to the host genome, normalized to ‘uR.7’. In a–d, each data point represents one control sample (9 BAL; 9 plasma), with colors indicating the sample type. Mean values are denoted as ‘mu’. Statistical analysis used one-tailed paired t-tests with Bonferroni correction (****, p ≤ 0.0001).
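The normalization and multiple-testing steps named in this legend can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names and the idea of passing per-sample lists are assumptions made for illustration, and the legend does not state the direction of the one-tailed test or the number of comparisons corrected for. The sketch computes only the paired t statistic; converting it to a p-value requires a t-distribution with n - 1 degrees of freedom from a statistics library.

```python
from math import sqrt
from statistics import mean, stdev


def normalize_to_baseline(values, baseline):
    """Divide each sample's value by the same sample's baseline value
    (e.g. the GRCh38.p14 or 'uR.7' condition). Hypothetical helper."""
    return [v / b for v, b in zip(values, baseline)]


def paired_t_statistic(a, b):
    """t statistic of the paired differences a_i - b_i (df = n - 1).
    The sign indicates the direction relevant to a one-tailed test."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def bonferroni(p_values):
    """Bonferroni correction: multiply each p-value by the number of
    tests and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With this layout, a baseline condition normalized to itself gives 1.0 for every sample, so the other conditions read directly as fold changes relative to it.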