Table 2. First four rows: example SMILES strings from the indication-to-drug task; last four rows: example MolT5 indication generations from the drug-to-indication task.

From: Emerging opportunities of using large language models for translation between drug molecules and indications

| Input | Ground truth | Output | Validity/similarity |
| --- | --- | --- | --- |
| Indication-to-drug | | | |
| Diabetes mellitus | COCCOc1cnc(NS(=O)(=O)c2ccccc2)nc1 | O=C([O-])CC(=O)[O-] | Valid |
| Coronary artery disease | CCOC(=O)C(C)=O | CCCCC[C@H](O)CC=CCC=CCCCC(=O)O | Valid |
| Respiratory system disease | CCC1(C)CC(=O)NC(=O)C1 | [H+].C(=O)[O-])[O-] | Syntax Error |
| Hemorrhage | CC1=CC(=O)c2ccccc2C1=O | C(=O)C(=O)O)O.O)O.O)O.O.O | Syntax Error |
| Drug-to-indication | | | |
| CN(C)CCOC(c1ccccc1)c1ccccc1 | Allergic disease ... cancer ... eczema ... | ... and cancer ... | 0.2206 |
| CCc1cc(C(N)=S)ccn1 | Multidrug-resistant tuberculosis osteomyelitis ... | ... cancer | 0.2262 |
| O=C([O-])c1ccccc1.[Na+] | Encephalopathy psychosis | Inamideamide protein protein proteinamide. | 0.0183 |
| Clc1ccccc1CN1CCc2sccc2C1 | Internal carotid artery stenosis ... Recurrent thrombophlebitis | Amideamideamide. | 0.0316 |
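
The two indication-to-drug outputs marked Syntax Error each contain a closing parenthesis with no matching opening one. The sketch below illustrates that with a pure-Python delimiter check. It is not the validity test used in the source, which this table does not describe; a full check would normally rely on a cheminformatics SMILES parser. The function name `has_balanced_delimiters` is hypothetical, and balanced delimiters are only a necessary condition for a well-formed SMILES string, not a sufficient one.

```python
def has_balanced_delimiters(smiles: str) -> bool:
    """Return True if every '(' and '[' is closed in the right order.

    Assumption: this is a simplified, illustrative check. It does not
    validate atoms, ring closures, or valences.
    """
    closers = {")": "(", "]": "["}
    stack = []
    for ch in smiles:
        if ch in "([":
            stack.append(ch)
        elif ch in closers:
            # A closer with no opener, or the wrong opener, is an error.
            if not stack or stack.pop() != closers[ch]:
                return False
    # Any opener left on the stack was never closed.
    return not stack


# Model outputs copied from the indication-to-drug rows of the table.
outputs = {
    "Diabetes mellitus": "O=C([O-])CC(=O)[O-]",
    "Coronary artery disease": "CCCCC[C@H](O)CC=CCC=CCCCC(=O)O",
    "Respiratory system disease": "[H+].C(=O)[O-])[O-]",
    "Hemorrhage": "C(=O)C(=O)O)O.O)O.O)O.O.O",
}

results = {name: has_balanced_delimiters(s) for name, s in outputs.items()}

for name, ok in results.items():
    print(f"{name}: {'balanced' if ok else 'unbalanced'}")
```

On these four strings the check agrees with the table's validity column: the first two outputs are balanced and the last two are not. That agreement holds for these examples only, since a string can have balanced delimiters and still be an invalid SMILES.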