Extended Data Fig. 2: Reference genome similarity in simulated ancient microbial data. | Nature

From: The spatiotemporal distribution of human pathogens in ancient Eurasia

a, Illustration of the phylogenetic context and expected average nucleotide identity (ANI) for a hypothetical sampled microbial species X and four genomes (A1, A2; B1, B2) from two genera (A, B) present in the reference database. b, Number of unique k-mers classified at the genus level using KrakenUniq for replicates with different read numbers, across all simulated species. The dashed line indicates the cutoff used in the analysis of real data (150 unique k-mers). c, Number of unique k-mers classified at the species level as a function of ANI, for mappings of the reads simulated for a particular species against every individual species reference genome in its genus. Blue diamonds indicate mappings against a reference genome from the same species as the simulated read data, whereas grey circles indicate reference genomes of other species. Selected individual species results are labelled with the species name. The dashed line indicates the ANI ≥ 0.97 cutoff. d, Bar plots showing the number of replicates in which the true-positive species reference genome ranked highest by number of unique k-mers classified at the species level.
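
The two thresholds and the ranking check described in panels b to d can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the `ReferenceHit` class, the function names and the example values are all hypothetical, and treating the 150 k-mer cutoff as inclusive is an assumption.

```python
from dataclasses import dataclass

GENUS_MIN_UNIQUE_KMERS = 150  # cutoff shown as the dashed line in panel b
ANI_CUTOFF = 0.97             # cutoff shown as the dashed line in panel c


@dataclass
class ReferenceHit:
    """Hypothetical record for one species reference genome in a genus."""
    species: str
    unique_kmers: int  # unique k-mers classified at the species level
    ani: float         # average nucleotide identity of the mapping


def genus_passes(unique_kmers_genus: int) -> bool:
    # Assumption: the cutoff is inclusive (>= 150).
    return unique_kmers_genus >= GENUS_MIN_UNIQUE_KMERS


def meets_ani_cutoff(hit: ReferenceHit) -> bool:
    return hit.ani >= ANI_CUTOFF


def true_species_ranks_first(hits: list[ReferenceHit], true_species: str) -> bool:
    # Panel d: is the true species' reference the top hit by unique k-mers?
    # Ties resolve to the first hit in the list; that choice is arbitrary.
    top = max(hits, key=lambda h: h.unique_kmers)
    return top.species == true_species


# Invented values, for illustration only.
hits = [
    ReferenceHit("species_X", unique_kmers=900, ani=0.99),
    ReferenceHit("species_Y", unique_kmers=400, ani=0.93),
]
print(genus_passes(1300), true_species_ranks_first(hits, "species_X"))
```

Counting the replicates for which `true_species_ranks_first` returns true would give one bar of the kind shown in panel d.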
