Fig. 1: Pre-training of Self-GenomeNet and use of the learned weights on a downstream task.
From: A self-supervised deep learning method for data-efficient training in genomics

a Self-GenomeNet takes part of a sequence as input and predicts the reverse-complement of the remaining sequence. The representations are learned by dividing unlabeled DNA sequences and their reverse-complements into patches, each of which is given as input to an encoder network \(f_{\theta}\). The outputs of \(f_{\theta}\) are then fed sequentially to a recurrent context network \(C_{\phi}\), yielding representations of the input sequence up to a point \(t\) (\(S_{1:t}\)) and representations of the reverse-complement of the input sequence from \(t+1\) to the end (i.e., \(\bar{S}_{N:t+1}\)). These representations are computed for multiple values of \(t\) simultaneously. Finally, the representations of \(S_{1:t}\) (\(z_{i}\)) and of \(\bar{S}_{N:t+1}\) (\(z^{\prime}_{(n-1-i)}\)) predict each other for multiple values of \(t\) using a contrastive loss, i.e., each representation is matched against the other sequences in the training batch. Thus, in one training iteration of Self-GenomeNet, each computed representation \(z_{i}\) and \(z^{\prime}_{(n-1-i)}\) is used efficiently, since \(z_{i}\) predicts \(z^{\prime}_{(n-1-i)}\) and \(z^{\prime}_{(n-1-i)}\) predicts \(z_{i}\) for \(i \in \{1, 2, \ldots, n-2\}\). For visual simplicity, the figure shows only \(z_{2}\) predicting \(z^{\prime}_{(n-3)}\). b The weights of \(f_{\theta}\) and \(C_{\phi}\) are initialized with the weights learned on the self-supervised task and are then fine-tuned, together with a linear layer, on the new supervised task.
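
The following is a minimal sketch of the pre-training step described in panel (a): patches of a sequence and of its reverse-complement are encoded by \(f_{\theta}\), summarized by a recurrent context network \(C_{\phi}\), and matched with a contrastive loss so that \(z_{i}\) and \(z^{\prime}_{(n-1-i)}\) predict each other within the batch. All module names (PatchEncoder, ContextGRU), layer choices, patch lengths, and the InfoNCE-style loss formulation are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of Self-GenomeNet pre-training; hyperparameters and modules are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

PATCH_LEN, N_PATCHES, EMB_DIM, HIDDEN = 20, 8, 64, 128  # assumed values

class PatchEncoder(nn.Module):
    """Encoder f_theta: maps a one-hot DNA patch to an embedding."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(4, EMB_DIM, kernel_size=5, padding=2)
        self.pool = nn.AdaptiveAvgPool1d(1)

    def forward(self, patches):                 # (B, N, 4, PATCH_LEN)
        b, n, c, l = patches.shape
        h = self.conv(patches.reshape(b * n, c, l))
        return self.pool(h).squeeze(-1).reshape(b, n, EMB_DIM)

class ContextGRU(nn.Module):
    """Recurrent context network C_phi: summarizes patches 1..t into z_t."""
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(EMB_DIM, HIDDEN, batch_first=True)

    def forward(self, x):                        # (B, N, EMB_DIM)
        out, _ = self.gru(x)
        return out                               # (B, N, HIDDEN)

def reverse_complement(onehot):
    """Reverse-complement one-hot sequences (B, 4, L) with channel order A, C, G, T."""
    return onehot.flip(dims=[1, 2])              # channel flip = complement, length flip = reverse

def self_genomenet_loss(z_fwd, z_rev, temperature=0.1):
    """Contrastive loss: z_i predicts z'_{(n-1-i)} and vice versa within the batch."""
    b, n, _ = z_fwd.shape
    targets = torch.arange(b)
    loss = 0.0
    for i in range(1, n - 1):                    # i in {1, .., n-2}, as in the caption
        a = F.normalize(z_fwd[:, i], dim=-1)
        p = F.normalize(z_rev[:, n - 1 - i], dim=-1)
        logits = a @ p.t() / temperature         # pairwise similarities across the batch
        loss = loss + F.cross_entropy(logits, targets)      # z_i predicts z'
        loss = loss + F.cross_entropy(logits.t(), targets)  # z' predicts z_i
    return loss / (2 * (n - 2))

# Toy usage: one pre-training step on random "DNA".
f_theta, c_phi = PatchEncoder(), ContextGRU()
seq = F.one_hot(torch.randint(0, 4, (8, N_PATCHES * PATCH_LEN)), 4).permute(0, 2, 1).float()
rc = reverse_complement(seq)
to_patches = lambda s: s.reshape(8, 4, N_PATCHES, PATCH_LEN).permute(0, 2, 1, 3)
z = c_phi(f_theta(to_patches(seq)))
z_bar = c_phi(f_theta(to_patches(rc)))
loss = self_genomenet_loss(z, z_bar)
loss.backward()
```

For the fine-tuning step in panel (b), the same f_theta and c_phi weights would be reused and trained further together with a task-specific linear layer on labeled data.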