Fig. 1: Neural ADMIXTURE model architecture. | Nature Computational Science

Fig. 1: Neural ADMIXTURE model architecture.

From: Neural ADMIXTURE for rapid genomic clustering

a, Single-head architecture. The input sequence (x) is projected into 64 dimensions by a linear layer (θ1) and passed through a GELU non-linearity (σ1). The cluster assignment estimates Q are computed by feeding the 64-dimensional hidden vector to a K-neuron layer (parametrized by θ2) activated with a softmax (σ2). Finally, the decoder outputs a reconstruction of the input (\(\tilde{x}\)) using a linear layer with weights F. The decoder is restricted to this linear architecture to ensure interpretability. b, Simple multi-head example with H = 3. The 64-dimensional hidden vector is copied and processed independently by H different sets of weights (\(\theta_{2_h}\)), which yield vectors of different dimensions corresponding to the different K values. Each \(Q_{K_h}\) matrix is then processed by its own decoder matrix \(F_{K_h}\), yielding H different reconstructions. All parameters are optimized jointly in an end-to-end fashion.
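The forward pass described in panels a and b can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`init_params`, `forward`), the random initialization, the omission of bias terms, the toy genotype encoding in [0, 1], and the choice to draw decoder entries from [0, 1) are all hypothetical and are not specified in the caption.

```python
import math
import random


def linear(x, W):
    """Return W @ x for a vector x and a row-major weight matrix W (no bias, by assumption)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gelu(v):
    """Exact GELU (sigma_1): t * Phi(t), with Phi the standard normal CDF."""
    return [0.5 * t * (1.0 + math.erf(t / math.sqrt(2.0))) for t in v]


def softmax(v):
    """Numerically stable softmax (sigma_2); the output sums to 1."""
    m = max(v)
    e = [math.exp(t - m) for t in v]
    s = sum(e)
    return [t / s for t in e]


def init_params(num_snps, ks, hidden=64, seed=0):
    """Random parameters for one shared encoder and one (theta2, F) pair per K value.

    The initialization scheme is an assumption made only so the sketch runs.
    """
    rng = random.Random(seed)
    theta1 = [[rng.uniform(-0.1, 0.1) for _ in range(num_snps)] for _ in range(hidden)]
    heads = []
    for k in ks:
        theta2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(k)]
        # Decoder weights F, stored as num_snps rows of length k so that
        # linear(q, F) maps a K-vector back to input space.
        F = [[rng.random() for _ in range(k)] for _ in range(num_snps)]
        heads.append({"theta2": theta2, "F": F})
    return {"theta1": theta1, "heads": heads}


def forward(x, params):
    """Return one (Q, reconstruction) pair per head for a single input vector x."""
    h = gelu(linear(x, params["theta1"]))        # shared 64-dimensional hidden vector
    outputs = []
    for head in params["heads"]:                 # each head sees the same copy of h
        q = softmax(linear(h, head["theta2"]))   # cluster assignment estimates, length K_h
        x_rec = linear(q, head["F"])             # linear decoder: reconstruction of x
        outputs.append((q, x_rec))
    return outputs


# Toy usage with H = 3 heads (K = 2, 3, 4) on a hypothetical 8-marker input.
x = [0.0, 0.5, 1.0, 1.0, 0.0, 0.5, 0.0, 1.0]
params = init_params(num_snps=len(x), ks=[2, 3, 4])
for q, x_rec in forward(x, params):
    print(len(q), len(x_rec))
```

Because the decoder is a single linear map applied to a softmax output, each reconstruction is a weighted average of the columns of F, with the entries of Q as the weights. For joint training, one option would be to sum a reconstruction loss over the H heads; the caption does not specify the loss.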
