Extended Data Fig. 4: Genome-wide distribution of contig ends representing potential assembly gaps in human genome assemblies generated by hifiasm.
From: Efficient near-telomere-to-telomere assembly of nanopore simplex reads

Contig ends correspond either to chromosome ends or to unresolved assembly gaps. We aligned contig ends from each assembly to either the HG002 or the CHM13 T2T reference genome. For the HG002-specific results (a–c), evaluations were performed using the HG002 Q100 ground truth. For the results in d,e, assemblies were aligned to the CHM13 T2T reference genome. a–c, Number of contig ends in the HG002 phased assemblies produced by hifiasm using PacBio HiFi (a), ONT standard (b) and ONT ultra-long (c) reads. Each contig end was hierarchically classified according to its overlap with annotated sequence features. Poisson breaks refer to isolated contig ends (≤2 ends within a ±100 kb window), which are likely to represent random assembly breaks. d,e, Genome-wide distribution of assembly gaps across 10 diploid human genome assemblies (20 haplotypes) generated using ONT standard simplex (d) and PacBio HiFi (e) reads, including both trio and non-trio assemblies. Chromosomal bars show haplotype-level gap coverage as a red heat map, indicating the number of haplotypes with gaps at each position (max: 20 for autosomes, sex-aware for chrX/Y). Black bars above each chromosome represent the number of diploid genomes with assembly gaps at each position (max: 10, after diploid-genome-level deduplication). Annotations below indicate segmental duplications (blue), centromeric satellites (yellow) and overlaps (green). More details are provided in Supplementary Information sections 1.13 and 1.14.
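The isolation criterion for Poisson breaks (≤2 ends within a ±100 kb window) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function name `flag_isolated_ends`, the `(chrom, pos)` input format and the choice to count the contig end itself towards the ≤2 threshold are all assumptions made for illustration. The hierarchical classification against annotated sequence features is not modelled here.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

WINDOW = 100_000  # ±100 kb window, as stated in the caption
MAX_ENDS = 2      # "≤2 ends"; ASSUMPTION: the count includes the end itself


def flag_isolated_ends(ends, window=WINDOW, max_ends=MAX_ENDS):
    """Map each (chrom, pos) contig end to True if it is isolated.

    Hypothetical helper: `ends` is an iterable of (chrom, pos) tuples,
    where pos is the aligned reference coordinate of a contig end.
    """
    # Group positions by chromosome so windows never span chromosomes.
    by_chrom = defaultdict(list)
    for chrom, pos in ends:
        by_chrom[chrom].append(pos)

    isolated = {}
    for chrom, positions in by_chrom.items():
        positions.sort()
        for pos in positions:
            # Count ends in the closed interval [pos - window, pos + window].
            lo = bisect_left(positions, pos - window)
            hi = bisect_right(positions, pos + window)
            isolated[(chrom, pos)] = (hi - lo) <= max_ends
    return isolated


if __name__ == "__main__":
    example = [
        ("chr1", 1_000_000), ("chr1", 1_050_000), ("chr1", 5_000_000),
        ("chr2", 100), ("chr2", 50_000), ("chr2", 90_000),
    ]
    for end, is_isolated in sorted(flag_isolated_ends(example).items()):
        print(end, "isolated" if is_isolated else "clustered")
```

Under these assumptions, the three chr1 ends in the example are flagged as isolated, whereas the three chr2 ends fall within 100 kb of one another and are treated as a cluster. If the threshold is instead meant to count only neighbouring ends, `max_ends` would need to be adjusted accordingly.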