Extended Data Fig. 4: UMI simulations.
From: Locus-specific expression of transposable elements in single cells with CELLO-seq

(a) Distribution of Levenshtein distances (x-axis) between randomly simulated UMIs, by UMI length, for the RYN pattern (left) or the NNN pattern (right). The light grey bar shows the distance threshold for grouping reads by UMI, as used for most short-read UMIs or for CELLO-seq. (b) Line graphs of the fraction of pure groups (y-axis) against Levenshtein distance (x-axis) for UMI groups, with either perfect read identity or ONT read identity. Left: UMI simulations without any pregrouping by mapping. Right: UMI simulations in which pregrouping was performed by randomly assigning true UMI sequences to groups of 100 unique UMIs. (c) Distribution of UMI group sizes (x-axis) by Levenshtein distance threshold (y-axis) and UMI length, with perfect ONT read identity, without pregrouping (left) or with pregrouping (right).
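
The simulation in panel (a) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the seed, and the UMI lengths and counts in the usage note are hypothetical choices made for illustration. The only assumptions about the patterns are the standard IUPAC meanings (R = A or G, Y = C or T, N = any base).

```python
import random
from collections import Counter
from itertools import combinations

# Standard IUPAC ambiguity codes used by the two UMI patterns.
IUPAC = {"R": "AG", "Y": "CT", "N": "ACGT"}


def simulate_umi(pattern, repeats, rng):
    """Draw one random UMI by repeating an IUPAC pattern such as 'RYN' or 'NNN'."""
    return "".join(rng.choice(IUPAC[code]) for code in pattern * repeats)


def levenshtein(a, b):
    """Edit distance (substitutions, insertions, deletions) by row-wise dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion from a
                            curr[j - 1] + 1,            # insertion into a
                            prev[j - 1] + (ca != cb)))  # match or substitution
        prev = curr
    return prev[-1]


def pairwise_distance_counts(pattern, repeats, n_umis, seed=0):
    """Count how often each Levenshtein distance occurs among all pairs of simulated UMIs."""
    rng = random.Random(seed)  # hypothetical seed, fixed only for reproducibility
    umis = [simulate_umi(pattern, repeats, rng) for _ in range(n_umis)]
    return Counter(levenshtein(a, b) for a, b in combinations(umis, 2))
```

Calling, for example, `pairwise_distance_counts("RYN", 8, 200)` and `pairwise_distance_counts("NNN", 8, 200)` gives two distance histograms for 24-nt UMIs that can be compared in the same way as the left and right plots of panel (a). The pairwise comparison is quadratic in the number of UMIs, so the sketch is only practical for small simulations.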