Figure 2: Validation of the draft genome assemblies. | Scientific Data

From: Population genomic datasets describing the post-vaccine evolutionary epidemiology of Streptococcus pneumoniae

(a) Venn diagram showing the overlap between the sequence types from the original genotyping of the collection, those inferred from Illumina sequence read mapping, and those inferred from the genome assemblies. Only the 594 isolates for which all three data types were available are represented. (b) Venn diagram showing the overlap between sequence types inferred by the same three methods, here treating two results as consistent if they differ at no more than one of the seven loci. Under this criterion, the sequence types inferred from read mapping and de novo assembly are consistent in every case, and differ from the original genotyping in only twelve cases. (c) Histogram showing the number of CDSs in publicly available annotated complete, or high-quality draft, S. pneumoniae genomes. (d) Histogram showing the number of CDSs in the 616 draft genomes from Massachusetts. The count of putative CDSs in each draft genome falls within the range observed in manually annotated genomes, consistent with the draft genomes being near-complete and the CDS predictions being accurate.
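The relaxed comparison used in panel (b) can be illustrated with a minimal Python sketch. It assumes each sequence type is represented by its seven-locus allelic profile. The function names and the allele numbers are hypothetical, chosen only for illustration, and this is not the authors' actual pipeline.

```python
from typing import Sequence


def loci_differing(profile_a: Sequence[int], profile_b: Sequence[int]) -> int:
    """Count the loci at which two allelic profiles carry different alleles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same number of loci")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def consistent(
    profile_a: Sequence[int],
    profile_b: Sequence[int],
    max_mismatches: int = 1,
) -> bool:
    """Treat two results as consistent if at most `max_mismatches` loci differ.

    max_mismatches=0 corresponds to the exact comparison of panel (a);
    max_mismatches=1 corresponds to the relaxed comparison of panel (b).
    """
    return loci_differing(profile_a, profile_b) <= max_mismatches


# Hypothetical seven-locus profiles for a single isolate (illustrative values).
original_genotyping = (7, 11, 10, 1, 6, 8, 1)
read_mapping = (7, 11, 10, 1, 6, 8, 14)
de_novo_assembly = (7, 11, 10, 1, 6, 8, 14)

exact_match = consistent(original_genotyping, read_mapping, max_mismatches=0)
relaxed_match = consistent(original_genotyping, read_mapping, max_mismatches=1)
```

In this made-up example the original genotyping and the read-mapping result differ at a single locus, so they fall outside the overlap under the exact comparison but inside it under the relaxed one.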
