Fig. 2: AF2’s multiple sequence alignment (MSA) clustering heuristic. | Nature Communications

From: High-throughput prediction of protein conformational distributions with subsampled AlphaFold2

A An MSA of arbitrary length is built from a target sequence and passed to AF2, which randomly selects a number of sequences (defined by max_seq) from the input MSA. Each selected sequence becomes a cluster center, and the sequences that were not selected are distributed among these centers. The target sequence is always selected as a cluster center. The resulting clusters are featurized and the relevant statistics are calculated. B All of the elements described in A are used by AF2 for inference. The cluster features and a number of random non-cluster-center sequences (defined by extra_seq) are processed and passed to the Main Evoformer Stack. The MSA containing the cluster centers is processed, passed to the comparatively expensive row/col attention track, and then also passed to the Main Evoformer Stack. Figure created with BioRender.com.
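The selection and clustering steps in panel A can be illustrated with a minimal NumPy sketch. This is not AF2's own code: the function name `subsample_msa`, its return values, and the toy alignment are hypothetical, and assigning each unselected sequence to its nearest center by Hamming distance is an assumption that the caption does not state. Only `max_seq` and `extra_seq` come from the caption.

```python
import numpy as np


def subsample_msa(msa, max_seq, extra_seq, seed=0):
    """Toy version of the clustering heuristic (hypothetical helper).

    msa: list of equal-length aligned strings; msa[0] is the target sequence.
    Returns (centers, clusters, extra) as row indices into msa.
    """
    if max_seq < 1:
        raise ValueError("max_seq must be at least 1 (the target sequence)")
    rng = np.random.default_rng(seed)
    n = len(msa)

    # The target (row 0) is always a center; the other centers are random.
    shuffled = rng.permutation(np.arange(1, n))
    n_centers = min(max_seq, n)
    centers = np.concatenate(([0], shuffled[: n_centers - 1])).astype(int)
    rest = shuffled[n_centers - 1:]  # sequences not chosen as centers

    # ASSUMPTION: unselected sequences join the center with the smallest
    # Hamming distance (number of mismatched alignment columns).
    arr = np.array([list(s) for s in msa])
    clusters = {int(c): [] for c in centers}
    if rest.size:
        dist = (arr[rest][:, None, :] != arr[centers][None, :, :]).sum(axis=2)
        nearest = centers[dist.argmin(axis=1)]
        for seq_idx, center_idx in zip(rest, nearest):
            clusters[int(center_idx)].append(int(seq_idx))

    # `rest` is already in random order, so a prefix is a random sample
    # of up to extra_seq non-center sequences.
    extra = rest[: min(extra_seq, rest.size)]
    return centers, clusters, extra


toy_msa = ["MKTAYIAK", "MKTAYIAR", "MRTAYLAK", "MKSAYIAK", "QKTGYIAK", "MKTAYVAK"]
centers, clusters, extra = subsample_msa(toy_msa, max_seq=3, extra_seq=2)
```

In this sketch, lowering `max_seq` leaves fewer cluster centers for the expensive attention track, and lowering `extra_seq` shrinks the pool of additional sequences, which are the two settings the subsampling approach varies.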
