Fig. 5: Visual analyses of PhyloTune based on nine molecular markers of the Plant test dataset.

a Comparison of the average AUROC across four taxonomic ranks for each marker in novelty detection (n = 15000 sequences for each rank). b–d Comparison of the average macro precision, macro recall, and macro F1-score across four taxonomic ranks for each marker in taxonomic classification (n = 14700, 14400, 13200, and 10000 sequences for class, order, family, and genus, respectively). e Pearson correlation coefficients between average attention, heterozygosity, the fixation index (FST), absolute divergence (DXY), and substitution rate based on nine molecular markers. f Attention heatmap of the chloroplast marker matK using PhyloTune (n = 1000 sequences). The red box highlights the attention peak region for the majority of sequences, with an example of the corresponding DNA sequences displayed above it. g Average attention, heterozygosity, substitution rate, FST, and DXY curves of matK. The blue shaded area denotes the peak region of attention. Source data are provided as a Source Data file.