Table 4. A detailed comparison of the contents of the RnaBench dataset (Inverse RNA Folding Dataset) with our dataset. When an ID has more than one distinct sample, the number of samples is given in parentheses next to the ID.

From: Comprehensive datasets for RNA design, machine learning, and beyond

| Compared datasets | Rfam/PDB ID | Avg. structure length (std. dev.) | No. of shared samples |
| --- | --- | --- | --- |
| RnaBench vs Our Dataset (samples extracted from the Rfam database) | RF00001, RF00005 (2), RF00007, RF00014, RF00019, RF00020, RF00021, RF00026, RF00029, RF00037, RF00043, RF00047, RF00053, RF00056, RF00090, RF00103, RF00167, RF00231, RF00237, RF00322, RF00400, RF00404, RF00406, RF00413, RF00422, RF00424, RF00446, RF00451, RF00545, RF00553, RF00565, RF00568, RF00582, RF00617, RF00641, RF00657, RF00667, RF00679, RF00906, RF00951, RF01225, RF01234, RF01241, RF01418, RF01751, RF01782, RF01797, RF01844, RF02030, RF02097, RF02635, RF02689, RF02723, RF02736, RF02737, RF02741, RF02742, RF02749, RF02755 | 87.4 (72.3) | 60 |
| RnaBench vs Our Dataset (samples extracted from the RNAsolo database) | 1JOX_1_A, 1R2P_9_A, 1U3K_7_A, 7UW1_1_B | 51.75 (36.47) | 4 |
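
The parenthesized counts in the ID lists can be checked against the reported numbers of shared samples. The following is a minimal sketch, not part of the original work: the helper name `count_samples` and the parsing rule (an entry ending in `(n)` contributes n samples, any other entry contributes one) are assumptions based only on the caption's description of the notation.

```python
import re


def count_samples(id_list: str) -> int:
    """Sum samples in a comma-separated ID list where 'ID (n)' means n samples."""
    total = 0
    for entry in id_list.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # A trailing "(n)" gives the sample count; otherwise the ID counts once.
        match = re.search(r"\((\d+)\)\s*$", entry)
        total += int(match.group(1)) if match else 1
    return total


# ID lists copied verbatim from the table rows.
rfam_ids = (
    "RF00001, RF00005 (2), RF00007, RF00014, RF00019, RF00020, RF00021, "
    "RF00026, RF00029, RF00037, RF00043, RF00047, RF00053, RF00056, "
    "RF00090, RF00103, RF00167, RF00231, RF00237, RF00322, RF00400, "
    "RF00404, RF00406, RF00413, RF00422, RF00424, RF00446, RF00451, "
    "RF00545, RF00553, RF00565, RF00568, RF00582, RF00617, RF00641, "
    "RF00657, RF00667, RF00679, RF00906, RF00951, RF01225, RF01234, "
    "RF01241, RF01418, RF01751, RF01782, RF01797, RF01844, RF02030, "
    "RF02097, RF02635, RF02689, RF02723, RF02736, RF02737, RF02741, "
    "RF02742, RF02749, RF02755"
)
rnasolo_ids = "1JOX_1_A, 1R2P_9_A, 1U3K_7_A, 7UW1_1_B"

rfam_total = count_samples(rfam_ids)
rnasolo_total = count_samples(rnasolo_ids)
print(rfam_total, rnasolo_total)
```

Under this reading, the Rfam row lists 59 distinct IDs, one of which (RF00005) has two samples, which agrees with the reported 60 shared samples; the RNAsolo row lists four IDs with one sample each.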