Figure 1
From: Robust and scalable barcoding for massively parallel long-read sequencing

NS-watermark barcodes for long-read sequencing technologies. Each of a set of 4096 IDs is expressed as a sequence of \(k=3\) symbols from a hexadecimal alphabet \(\mathscr {A}\) (x, blue). A hexadecimal LDPC code systematically adds \(m=3\) redundant symbols (grey) to x to generate an LDPC codeword c of length \(n=6\). In the example shown, x = 4AF maps to c = 131:4AF. A dictionary \(\varepsilon \) (right) translates symbols from \(\mathscr {A}\) into unique short nucleotide sequences of size \(u=6\), thereby converting c into a nucleotide sequence of size \(l=n \times u=36\). The Galois field addition \(\oplus \) of an optional “watermark” sequence of size l produces the final target barcode sequence (cyan). The operator \(\oplus \) is defined in the table on the bottom left.
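
The pipeline described in the caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionary built by `placeholder_dictionary` is a hypothetical stand-in for \(\varepsilon \), the nucleotide labelling A=0, C=1, G=2, T=3 used for \(\oplus \) is an assumption (the figure's own table defines the actual operator), and the parity symbols `131` are copied from the caption's example rather than computed, because the LDPC parity-check matrix is not given here.

```python
NUCS = "ACGT"  # assumed labelling A=0, C=1, G=2, T=3; the figure's table may differ


def id_to_symbols(barcode_id, k=3):
    """Write an ID as k hexadecimal symbols (16**3 = 4096 IDs for k = 3)."""
    if not 0 <= barcode_id < 16 ** k:
        raise ValueError("ID out of range for k hexadecimal symbols")
    return format(barcode_id, f"0{k}X")


def placeholder_dictionary(u=6):
    """Hypothetical stand-in for the dictionary: one unique u-nt word per symbol.

    Each symbol value is written as two base-4 digits and that nucleotide
    pair is repeated to length u. The real dictionary is not reproduced here.
    """
    table = {}
    for value in range(16):
        pair = NUCS[value // 4] + NUCS[value % 4]
        table[format(value, "X")] = pair * (u // 2)
    return table


def to_nucleotides(codeword, dictionary):
    """Concatenate the dictionary word for every symbol of the codeword."""
    return "".join(dictionary[symbol] for symbol in codeword)


def gf4_add(seq, watermark):
    """Position-wise GF(4) addition, realised as XOR of 2-bit labels."""
    if len(seq) != len(watermark):
        raise ValueError("sequence and watermark must have equal length")
    return "".join(
        NUCS[NUCS.index(a) ^ NUCS.index(b)] for a, b in zip(seq, watermark)
    )


x = id_to_symbols(0x4AF)            # information symbols "4AF"
c = "131" + x                       # parity symbols taken from the caption
sequence = to_nucleotides(c, placeholder_dictionary())
watermark = "ACGT" * 9              # arbitrary example watermark of size l = 36
barcode = gf4_add(sequence, watermark)
```

Because every element of GF(4) is its own additive inverse, applying `gf4_add` with the same watermark a second time returns the unwatermarked sequence, which is how a decoder would strip the watermark under these assumptions.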