Fig. 3: Molbit classification and tag decoding results. | Nature Communications

From: Rapid and robust assembly and decoding of molecular tags with DNA-based nanopore signatures

a Violin plots with embedded box-and-whisker plots showing the distribution of raw nanopore signal lengths for DNA sequences of length 400 nt (n = 354,000 reads) and 1600 nt (n = 375,000 reads). The white dot represents the median, and the thick center line shows the quartiles. The dashed vertical line shows the cutoff used when calling reads as 400-mers or 1600-mers.

b Correlation of read counts for each molbit, demonstrating consistency in molbit occurrences between training and testing runs (two-sided Pearson correlation, p = 3.1 × 10⁻²⁶, r = 0.84). Counts were first normalized within each run and normalized again after combining runs for either training or testing.

c Tag decoding workflow with error-correcting codes (ECC). After nanopore current traces are acquired from a standard sequencing run, the molbit in each trace is identified using the CNN (confidence ≥ 0.9). Successfully identified molbits are accumulated and converted into a binary word using a presence threshold. Error correction is repeated over a range of thresholds, and the decoded binary tag with the fewest differences from the received codeword is accepted.

d Chance of incorrect tag decoding as a function of the bit corruption rate and the number of data bits. This chance increases exponentially as the corruption rate and number of data bits increase linearly. The dashed line represents the goal of 1 incorrectly decoded tag in 1 billion, and the “+” marks Porcupine’s chance of incorrect decoding.

Source data are provided as a Source Data file.
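The decoding workflow in panel c can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the threshold values, and the toy codebook (2 data bits, each repeated three times) are assumptions made for illustration, and a brute-force nearest-codeword search stands in for the actual ECC decoder. Only the confidence cutoff of 0.9 and the rule of keeping the result with the fewest differences from the received codeword come from the caption.

```python
from itertools import product

CONFIDENCE_MIN = 0.9  # CNN confidence cutoff stated in the caption


def count_molbits(calls, n_molbits, confidence_min=CONFIDENCE_MIN):
    """Accumulate per-molbit read counts from (molbit_index, confidence) calls.

    Calls below the confidence cutoff are discarded.
    """
    counts = [0] * n_molbits
    for index, confidence in calls:
        if confidence >= confidence_min:
            counts[index] += 1
    return counts


def to_bits(counts, threshold):
    """Mark a molbit as present (1) when its count reaches the threshold."""
    return tuple(1 if c >= threshold else 0 for c in counts)


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode_tag(counts, codebook, thresholds):
    """Try each presence threshold and keep the closest codeword overall.

    Returns (distance, codeword, threshold). The exhaustive search over
    `codebook` is an assumed stand-in for a real ECC decoder.
    """
    best = None
    for threshold in thresholds:
        received = to_bits(counts, threshold)
        codeword = min(codebook, key=lambda cw: hamming(cw, received))
        distance = hamming(codeword, received)
        if best is None or distance < best[0]:
            best = (distance, codeword, threshold)
    return best


# Hypothetical toy code: 2 data bits, each repeated 3 times (6 molbits).
TOY_CODEBOOK = [
    tuple(bit for bit in data for _ in range(3))
    for data in product((0, 1), repeat=2)
]

# Illustrative counts: molbits 0-2 are truly present, 3-5 are background noise.
example_counts = [40, 35, 6, 1, 0, 2]
distance, tag, threshold = decode_tag(example_counts, TOY_CODEBOOK, [1, 5, 10])
print(distance, tag, threshold)
```

In this toy example a threshold of 1 lets background reads through, and a threshold of 10 drops a weakly observed molbit. The intermediate threshold yields a received word that matches a codeword exactly, so that result is the one kept.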
