Table 1 Dataset sources listed in the database.

From: SOMAS: a platform for data-driven material discovery in redox flow battery development

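| Database | Final size | Compound representation | Average molecular weight |
|----------|------------|-------------------------|--------------------------|
| 1 | 1,068 | Name, partial CASRN, SMILES | 167.09 |
| 2 | 2,122 | Name, partial CASRN, | 216.23 |
| 3 | 149 | Name, CASRN, | 185.27 |
| 4 | 2,791 | Partial Name, SMILES, partial CASRN | 257.19 |
| 5 | 1,743 | Name, CASRN | 248.66 |
| 6 | 3,823 | Name, InChIKey | 266.28 |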