Figure 3 | Scientific Reports

Figure 3

From: Nanopore sequencing technology: a new route for the fast detection of unauthorized GMO

Workflow used for the annotation of the clusters. To annotate the clustered reads, the representative sequence of each cluster was searched with BLAST against both the green plants division of the NCBI Reference genomic sequences database (“refseq_genomic_green_plants”) and the NCBI Nucleotide collection database excluding the Oryza sequences (“nt-rice”), using default BLAST parameters and a word size of 64. The sequences were first subdivided into three classes: (class 1) sequences having a long hit (98% of the query length or more) against refseq_genomic_green_plants; (class 2) sequences having an intermediate-length hit (less than 98% of the query length) against refseq_genomic_green_plants; and (class 3) sequences having no hits against refseq_genomic_green_plants. Class 2 was further subdivided into (class 2A) sequences whose hits were due to similarity to functional elements, such as promoters and coding regions, from genomes of wild-type and transgenic plants, and (class 2B) sequences whose hits were to plant genomic sequences that did not encode functional genetic elements. Class 1 was considered non-informative and was not processed further, whereas classes 2A, 2B and 3 were grouped according to their best hit against the nt-rice database. Class 2A and class 3 sequences having full-length or partial hits to non-rice sequences were considered as potentially originating from internal parts of the transgenic insert, while class 2B sequences that aligned to non-rice sequences were categorized as possibly containing the junction sequences of the transgenes. Class 2A sequences having no hits against nt-rice could not provide any information. The numbers in brackets give, respectively, the number of clusters and the total number of processed reads associated with them.
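
The decision logic described in this caption can be sketched in a few lines of Python. This is only an illustrative reading of the caption, not the authors' pipeline: the `ClusterHits` record, its field names and the functions `refseq_class` and `interpret` are hypothetical, and the functional-element flag is assumed to come from a separate inspection of the hits. Cases the caption does not describe (class 2B or class 3 without an nt-rice hit) are returned as unspecified rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold stated in the caption: a "long" hit covers 98% or more of the query.
LONG_HIT_FRACTION = 0.98


@dataclass
class ClusterHits:
    """Hypothetical summary of the BLAST results for one cluster representative."""
    # Fraction of the query covered by the best refseq_genomic_green_plants hit;
    # None means no hit at all.
    refseq_coverage: Optional[float]
    # Assumed flag: the refseq hit is due to a functional element (promoter, coding region).
    hits_functional_element: bool = False
    # True if the sequence has a full-length or partial hit in nt-rice.
    has_nt_rice_hit: bool = False


def refseq_class(cluster: ClusterHits) -> str:
    """Assign class 1, 2A, 2B or 3 from the refseq_genomic_green_plants result."""
    if cluster.refseq_coverage is None:
        return "3"
    if cluster.refseq_coverage >= LONG_HIT_FRACTION:
        return "1"
    return "2A" if cluster.hits_functional_element else "2B"


def interpret(cluster: ClusterHits) -> str:
    """Map a cluster to the interpretation given in the caption."""
    cls = refseq_class(cluster)
    if cls == "1":
        return "non-informative"
    if cluster.has_nt_rice_hit:
        if cls in ("2A", "3"):
            return "possible internal part of transgenic insert"
        return "possible transgene junction"
    if cls == "2A":
        return "no information"
    # The caption does not say how class 2B or 3 without nt-rice hits are treated.
    return "not specified in caption"
```

For example, a cluster with an 80% refseq hit to non-functional plant sequence and a hit in nt-rice, `interpret(ClusterHits(0.80, False, True))`, falls in class 2B and is flagged as a possible transgene junction.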
