Generative models in cheminformatics depend on molecules being representable as structured data, such as the simplified molecular-input line-entry system (SMILES). Mokaya and colleagues investigated how the choice of representation influences the quality of generated compounds, and found that string-based representations can hinder performance in a curriculum learning setting.
- Maranga Mokaya
- Fergus Imrie
- Charlotte M. Deane