Fig. 2: Image-based encoding of MEI candidate supporting reads. | Nature Communications

From: Image-based DNA sequencing encoding for detecting low-mosaicism somatic mobile element insertions

a Supporting reads used to detect MEIs include split-reads (SRs), paired-end reads (PEs) and clipped paired-end reads (clipped PEs). Blue indicates the segment of a supporting read that maps to the flanking sequence, while red denotes the segment that maps to the mobile element (ME) consensus sequence. For SRs, the segment mapped to the ME consensus must be longer than 30 bp, while for clipped PEs the region mapped to the flanking sequence must be longer than half of the read length to ensure mapping accuracy. b Two supporting reads from a candidate L1 insertion, each with two paired ends, are encoded into a three-channel image with nine tracks that integrates positional features (relative genome positions) and sequence-based features (read-ME alignment). The L1 consensus sequence is divided into the 5’-end (top) and the 3’-end (bottom), with a further zoom into the 5’-end to illustrate the encoding syntax. Track 1 shows flanking sequence mappability (0–1, black = fully mappable). Tracks 2, 4, 6, and 8 depict the genome positions of each read end, and tracks 3, 5, 7, and 9 show the alignment to the L1Hs consensus; blue arrows indicate genome positions, red pixels indicate L1Hs alignment, and mismatches appear as coexisting black and red pixels. Read 1 is a PE read with end1 mapped to the human genome (blue arrow in track 2) and end2 mapped to L1 (red pixels in track 5). In track 5, the L1Hs consensus sequence is represented as a matrix in which columns are base positions and rows are the nucleotides A, C, T, and G, from top to bottom. The read sequence aligning to L1Hs is highlighted in red pixels. A mismatch appears as a column with both black (L1Hs) and red (supporting read) pixels. Read 2 is a clipped PE read; the shorter blue arrow in track 8 indicates that end2 maps only partially to the genome, with the unmapped portion shown as a black line and the portion mapped to L1Hs displayed as red pixels in track 9.
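
The length thresholds in panel a and the alignment-track layout in panel b can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the text labels standing in for pixel colours (`K` for black, `R` for red, `.` for empty), and the choice to draw the consensus in black at every position before overlaying the read are assumptions made for illustration. Only the row order (A, C, T, G), the 30 bp and half-read-length thresholds, and the rule that a mismatch column holds both a black and a red pixel come from the caption.

```python
ROWS = "ACTG"  # row order given in the caption, top to bottom


def is_supporting_read(kind, me_mapped_len, flank_mapped_len, read_len):
    """Apply the caption's length thresholds (hypothetical helper).

    kind is one of "SR", "PE", "clipped_PE" (labels assumed here).
    """
    if kind == "SR":
        # Segment mapped to the ME consensus must be longer than 30 bp.
        return me_mapped_len > 30
    if kind == "clipped_PE":
        # Flanking-sequence mapping must exceed half of the read length.
        return flank_mapped_len > read_len / 2
    # The caption states no extra length threshold for plain PEs.
    return kind == "PE"


def encode_alignment_track(consensus, read, offset):
    """Build a 4 x len(consensus) grid for one read-to-consensus track.

    Assumption: the consensus base is drawn black ('K') in every column,
    then the read is overlaid in red ('R') starting at column `offset`.
    Bases outside A/C/T/G (for example N) are skipped.
    """
    grid = [["."] * len(consensus) for _ in ROWS]
    for col, base in enumerate(consensus):
        if base in ROWS:
            grid[ROWS.index(base)][col] = "K"
    for i, base in enumerate(read):
        col = offset + i
        if base in ROWS and 0 <= col < len(consensus):
            # A matching base overwrites the black pixel with red;
            # a mismatching base lands on a different row of the column.
            grid[ROWS.index(base)][col] = "R"
    return ["".join(row) for row in grid]


def mismatch_columns(track):
    """Columns holding both a black and a red pixel, i.e. mismatches."""
    width = len(track[0])
    return [c for c in range(width)
            if {"K", "R"} <= {row[c] for row in track}]


# Toy example: a 3-base read aligned at column 2 of a 6-base consensus.
track = encode_alignment_track("ACGTAC", "GAA", offset=2)
for row_label, row in zip(ROWS, track):
    print(row_label, row)
print("mismatch columns:", mismatch_columns(track))
```

In the toy example the read base A is placed over consensus base T at column 3, so that column carries a red pixel in row A and a black pixel in row T, which is the mismatch pattern the caption describes.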
