Fig. 1: Error correction of ONT simplex reads. | Nature

From: Efficient near-telomere-to-telomere assembly of nanopore simplex reads

a, Error correction in existing hifiasm for PacBio HiFi reads. Hifiasm identifies informative sites, at which each allele (shown in red or blue) is supported by multiple reads. Sequencing errors are shown as purple dots. The algorithm then corrects the target read (black) using supporting reads (orange) that match the target read across all informative sites.

b, Recurrent sequencing errors in ONT simplex reads. The existing error-correction approach in hifiasm incorrectly identifies recurrent sequencing errors (shown as purple dots) as informative sites because each such site is supported by two reads. In this case, all blue reads are correctly excluded owing to the real informative sites x and y, whereas all orange reads are mistakenly discarded because of the false-positive informative site z. With the existing error-correction approach, no reads remain available for correction.

c, Error correction in hifiasm (ONT) with two haplotypes. The site vectors corresponding to informative sites x, y, m and n (highlighted in red) are identified as mutually compatible. These sites can be grouped into the same cluster using the dynamic programming matrix. Sites resulting from sequencing errors, such as z and t, are incompatible with other sites and remain unclustered.

d, Error correction in hifiasm (ONT) with more than two haplotypes or repeat copies. The target read (black) and the orange reads originate from haplotype or repeat copy 1, the blue reads from haplotype or repeat copy 2 and the yellow reads from haplotype or repeat copy 3. Using the dynamic programming matrix, sites x and m can be grouped into one cluster (highlighted in yellow), whereas sites y and n form another cluster (highlighted in blue). Sites x and m exclude reads from haplotype or repeat copy 3, whereas sites y and n exclude reads from haplotype or repeat copy 2.
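The scenario in panels a to c can be illustrated with a toy example. The Python sketch below is not hifiasm's implementation: the function names, the `min_support` threshold and the read data are hypothetical, and the caption's dynamic programming matrix is replaced by a much simpler stand-in that treats two sites as compatible only when they split the reads into exactly the same groups. Each read is a string with one base per site, in the order x, y, z.

```python
from collections import Counter

def informative_sites(reads, min_support=2):
    """Sites where at least two alleles are each seen in >= min_support reads."""
    n_sites = len(next(iter(reads.values())))
    sites = []
    for s in range(n_sites):
        counts = Counter(seq[s] for seq in reads.values())
        if sum(1 for c in counts.values() if c >= min_support) >= 2:
            sites.append(s)
    return sites

def supporting_reads(reads, target, sites):
    """Reads (other than the target) that match the target at every given site."""
    t = reads[target]
    return [name for name, seq in reads.items()
            if name != target and all(seq[s] == t[s] for s in sites)]

def compatible_sites(reads, sites):
    """Simplified stand-in for clustering: keep a site only if at least one
    other site partitions the reads into exactly the same groups."""
    names = sorted(reads)

    def partition(s):
        groups = {}
        for n in names:
            groups.setdefault(reads[n][s], set()).add(n)
        return frozenset(frozenset(g) for g in groups.values())

    parts = {s: partition(s) for s in sites}
    tally = Counter(parts.values())
    return [s for s in sites if tally[parts[s]] >= 2]

# Hypothetical data: columns are sites x, y, z. Site z carries a recurrent
# error (G) shared by the target and one blue read.
reads = {
    "target": "AAG",
    "o1": "AAT", "o2": "AAT", "o3": "AAT",   # same haplotype as the target
    "b1": "CCG", "b2": "CCT", "b3": "CCT",   # other haplotype
}

sites = informative_sites(reads)
print(sites)                                      # [0, 1, 2]
print(supporting_reads(reads, "target", sites))   # []

clustered = compatible_sites(reads, sites)
print(clustered)                                       # [0, 1]
print(supporting_reads(reads, "target", clustered))    # ['o1', 'o2', 'o3']
```

With all three sites treated as informative, no read survives the filter, as in panel b. Dropping the unclustered site z recovers the three orange reads, as in panel c. Exact partition matching would be too strict for real data, where non-recurrent errors perturb individual reads; it is used here only to keep the sketch short.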
