Fig. 2: Effects of target length, MSA depth, recycling, and metagenome MSAs on modeling accuracy.

a–c The 36 targets in the test set (cyan circles) and the 48 targets in the validation set (orange triangles) are plotted. a RMSD (Å) relative to the length of the query RNA sequence. b RMSD (Å) relative to the raw count of sequences in the MSAs. c RMSD (Å) relative to the number of effective sequences (Nf) in the MSAs. d The average RMSD of generated structures (solid lines, left axis, Å) and pLDDT (dashed lines, right axis) relative to the number of recycles. The test and validation data are shown in cyan and orange, respectively. e Comparison of prediction performance between the Baseline model, which used three recycles, and the best structure for each target across all 30 recycles. Performance was measured by RMSD (Å). f Comparison of prediction performance between the Baseline model and the best pLDDT model from 11 ± 3 recycles. Performance was measured by RMSD (Å). g The depth of the MSAs from the metagenome database search relative to the original MSAs. Meta Only, the MSA from the metagenome search; Concat., a simple concatenation of the metagenome MSA and the original MSA; Filtered, the concatenated MSA with redundant sequences removed. h Performance comparison between the Baseline model and the structure selected based on the pLDDT score. The source of the MSA of the selected structure is indicated by different symbols. i Changes in performance as a function of the difference in MSA depth. A positive Δ MSA indicates that the selected MSA contains more sequences than the baseline MSA.