Fig. 1: Overview of the tree update process and PhyloTune methodology.

a Compared with the standard pipeline, PhyloTune constrains updates to a specified subtree. By identifying potentially informative regions within the subtree sequences, it reduces the number and length of the input sequences passed to alignment tools (e.g., MAFFT) and tree inference tools (e.g., RAxML), thereby improving tree update efficiency. b Overview of PhyloTune. For a given phylogenetic tree requiring updates, hierarchical linear probes (HLPs) are designed to match its taxonomic hierarchy. These probes are fine-tuned on a pre-trained DNA model to classify each query sequence into the smallest taxonomic unit within the specified tree and to extract high-attention regions from all sequences within the corresponding clade. c The PhyloTune model architecture for the Plant dataset. It combines a Transformer-based BERT module inherited from DNABERT with HLPs covering four taxonomic ranks: class, order, family, and genus.
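
As a minimal illustration of the region-selection idea in panel b, the sketch below assumes that per-position attention scores are already available as a list of floats. It splits the sequence into equal-width regions and keeps the regions with the highest mean attention. The function names, the equal-width split, and the mean-score ranking are assumptions made for illustration, not the PhyloTune implementation.

```python
from statistics import mean


def top_attention_regions(attention, num_regions, top_m):
    """Return (start, end) bounds of the top_m regions by mean attention.

    Hypothetical helper: `attention` holds one score per sequence position,
    and the sequence is split into `num_regions` near-equal-width regions.
    """
    if num_regions <= 0 or top_m <= 0:
        raise ValueError("num_regions and top_m must be positive")
    n = len(attention)
    if n < num_regions:
        raise ValueError("sequence is shorter than the number of regions")
    # Integer arithmetic spreads any remainder across the regions.
    bounds = [(i * n // num_regions, (i + 1) * n // num_regions)
              for i in range(num_regions)]
    # Rank regions by mean attention, highest first.
    ranked = sorted(bounds,
                    key=lambda b: mean(attention[b[0]:b[1]]),
                    reverse=True)
    # Report the selected regions in sequence order.
    return sorted(ranked[:top_m])


def extract_regions(sequence, regions):
    """Slice the selected (start, end) regions out of a sequence string."""
    return [sequence[start:end] for start, end in regions]


# Toy example: 8 positions, 4 regions, keep the 2 highest-scoring regions.
scores = [0.1, 0.1, 0.9, 0.8, 0.2, 0.1, 0.7, 0.6]
regions = top_attention_regions(scores, num_regions=4, top_m=2)
fragments = extract_regions("AACCGGTT", regions)
```

In this toy case the second and fourth regions have the highest mean scores, so only those fragments would be passed on to alignment and tree inference, which is how the input shrinks in length.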