Fig. 3 | Scientific Reports

Fig. 3

From: Dark and camouflaged genomic regions remain challenging in CHM13

The T2T-CHM13 genome contained more dark and camouflaged genes than HG38; ONT outperformed other platforms. (a) In CHM13, 8,077 gene bodies contained dark regions for Illumina100, whereas ONT samples had only 286 dark gene bodies. (b) In CHM13, 6,059 gene bodies contained dark-by-MAPQ regions for Illumina100, whereas ONT samples had only 180 dark-by-MAPQ gene bodies. (c) In CHM13, 3,709 gene bodies contained dark-by-depth regions for Illumina100, whereas ONT samples had only 107 dark-by-depth gene bodies. (d) In CHM13, 1,354 gene bodies contained both dark-by-depth and dark-by-MAPQ regions for Illumina100, whereas ONT samples had only 4 such gene bodies. (e) CTAG1A/B contained both dark-by-depth and dark-by-MAPQ regions within a larger dark region. (f) IL3RA was 41.5% dark in Illumina100, comprising both types of dark regions. (g) CHM13 had almost twice as many genes that were at least 5% dark as HG38 without alternate contigs. (h) Although CHM13 had far more genes that were at least 5% dark-by-MAPQ, ONT resolved 95% of them. (i) Genes that were at least 5% dark-by-depth were an order of magnitude less common (see Y-axis) than genes that were at least 5% dark-by-MAPQ, but ONT still performed best across the board. (j) HG38 with alternate contigs had more gene bodies that were at least 5% dark by both types. (k) CHM13 generally had the most genes that were at least 5% camouflaged, the vast majority of which were resolved with long reads. (l) CHM13 generally had the most genes that were 100% camouflaged.
