Figure 5 | Scientific Reports

Figure 5

From: Generator based approach to analyze mutations in genomic datasets

Mutation Region Detection. Synthetic experiments illustrate how our method works. (a) The workflow of the synthetic experiments. We first generate a random original RNA sequence of length L (in nt), then mutate each nucleotide site of the original sequence with probability p to generate 2N mutated sequences. We assume that the three common mutation types (i.e., substitution, insertion and deletion) occur with equal probability. We then divide the 2N sequences equally into two groups. The first group is left unprocessed, while every sequence in the second group receives the same randomly chosen new mutations (i.e., the same mutation at the same nucleotide site). Finally, we apply our mutation region detection algorithm (with region length l, in nt) to detect the regions that contain mutations, and run the alignment algorithm on these regions to find the specific mutations. (b-e) Examples with four combinations of sequence length L and region length l: (b) \(L=1000, l=100\); (c) \(L=10000, l=1000\); (d) \(L=1000, l=50\); (e) \(L=10000, l=500\). In each example, we set the mutation probability p to \(5\%\) and the number of sequences in each group to 100, randomly add 5 mutations to the sequences in the second group, and detect the mutation regions with our mutation region detection algorithm.
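
The data-generation step of panel (a) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`random_rna`, `synthetic_groups`, and so on) are hypothetical, substitutions are assumed to pick a different base, insertions are assumed to place one random base after the site, and the shared mutations of the second group are assumed to be indexed by positions in the original sequence so that "the same nucleotide site" stays well defined after insertions and deletions. The detection and alignment steps are not shown because the caption does not specify them.

```python
import random

BASES = "ACGU"
KINDS = ("substitution", "insertion", "deletion")  # assumed equally likely


def random_rna(length, rng):
    """Random original RNA sequence of the given length."""
    return "".join(rng.choice(BASES) for _ in range(length))


def random_mutation(original_base, rng):
    """Draw a mutation type uniformly, plus the base it uses (if any)."""
    kind = rng.choice(KINDS)
    if kind == "substitution":
        # Assumption: a substitution always changes the base.
        base = rng.choice([b for b in BASES if b != original_base])
    else:
        base = rng.choice(BASES)  # unused for deletions
    return kind, base


def mutate_site(segment, kind, base):
    """Apply one mutation to the text derived from a single original site."""
    if kind == "deletion":
        return ""
    if kind == "insertion":
        return segment + base  # assumption: insert after the site
    return base  # substitution: the site now reads as `base`


def mutated_segments(original, p, rng):
    """Mutate each site independently with probability p.

    Returns one string per original site, so later edits can still
    address sites by their original coordinates.
    """
    segments = []
    for b in original:
        if rng.random() < p:
            kind, base = random_mutation(b, rng)
            segments.append(mutate_site(b, kind, base))
        else:
            segments.append(b)
    return segments


def synthetic_groups(L=1000, N=100, p=0.05, n_shared=5, seed=0):
    """Build the two groups of N sequences described in panel (a)."""
    rng = random.Random(seed)
    original = random_rna(L, rng)
    all_segments = [mutated_segments(original, p, rng) for _ in range(2 * N)]

    # The same new mutations, at the same sites, for every group-2 sequence.
    shared_sites = sorted(rng.sample(range(L), n_shared))
    shared = {s: random_mutation(original[s], rng) for s in shared_sites}

    group1 = ["".join(seg) for seg in all_segments[:N]]
    group2 = []
    for seg in all_segments[N:]:
        seg = list(seg)
        for site, (kind, base) in shared.items():
            seg[site] = mutate_site(seg[site], kind, base)
        group2.append("".join(seg))
    return original, group1, group2, shared
```

Calling `synthetic_groups(L=1000, N=100, p=0.05, n_shared=5)` mirrors the settings of panel (b) apart from the region length l, which only matters for the detection step.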
