Fig. 8: Scorpio Architecture and Hierarchical Sampling Strategy. | Communications Biology

From: Enhancing nucleotide sequence representations in genomic analysis with contrastive optimization

a Model architecture. Each branch transforms a 768- (or 4096-) dimensional encoded embedding into a 256-dimensional triplet vector. Three types of encoder are used: BigBird embedding vectors, 6-mer frequency vectors, and a model in which BigBird is used with the embedding layer. b Hierarchical selection of a positive and a negative sample for an anchor in the gene-taxa dataset. First, the level of similarity is chosen at random from the hierarchy (Ge: gene, P: phylum, C: class, O: order, F: family, G: genus). The positive sample matches the anchor at this level, while the negative sample is chosen from one level up to ensure dissimilarity.
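The selection process in panel b can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout, the `LEVELS` ordering, and the function names `agree_through` and `sample_triplet` are hypothetical. It also assumes one particular reading of "one level up", namely that the negative agrees with the anchor down to the parent of the chosen level but differs at the chosen level itself (at the gene level, any sample with a different gene qualifies).

```python
import random

# Hierarchy from coarsest to finest, following the caption's abbreviations
# (Ge, P, C, O, F, G). The key names are assumptions made for this sketch.
LEVELS = ["gene", "phylum", "class", "order", "family", "genus"]


def agree_through(a, b, depth):
    """True if records a and b carry the same label at every level 0..depth."""
    return all(a[LEVELS[i]] == b[LEVELS[i]] for i in range(depth + 1))


def sample_triplet(anchor, dataset, rng=random):
    """Pick a random level, then a positive and a negative for the anchor.

    Positive: agrees with the anchor down to the chosen level.
    Negative (assumed reading): agrees down to the parent level but differs
    at the chosen level. Levels with no valid candidates are skipped.
    Returns (level_name, positive, negative), or None if no level works.
    """
    depths = list(range(len(LEVELS)))
    rng.shuffle(depths)  # the similarity level is chosen at random
    for depth in depths:
        level = LEVELS[depth]
        positives = [r for r in dataset
                     if r is not anchor and agree_through(anchor, r, depth)]
        negatives = [r for r in dataset
                     if (depth == 0 or agree_through(anchor, r, depth - 1))
                     and r[level] != anchor[level]]
        if positives and negatives:
            return level, rng.choice(positives), rng.choice(negatives)
    return None


def make_record(*labels):
    """Build a record dict from labels given in LEVELS order."""
    return dict(zip(LEVELS, labels))


# Toy gene-taxa records; all labels are made up for illustration.
records = [
    make_record("rpoB", "P1", "C1", "O1", "F1", "G1"),
    make_record("rpoB", "P1", "C1", "O1", "F1", "G2"),
    make_record("rpoB", "P1", "C1", "O1", "F2", "G3"),
    make_record("rpoB", "P2", "C2", "O2", "F3", "G4"),
    make_record("gyrA", "P1", "C1", "O1", "F1", "G1"),
]
anchor = records[0]
triplet = sample_triplet(anchor, records, rng=random.Random(0))
```

Here `triplet` holds the chosen level name together with the positive and negative records. In a training loop, the three sequences would each be passed through an encoder branch and the resulting 256-dimensional vectors compared under a triplet objective.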