Table 1 Seq2PKS identifies the correct compound structure of 23 BGCs

From: Discovering type I cis-AT polyketides through computational mass spectrometry and genome mining with Seq2PKS

| Compound name | BGC ID | Mass | Starter unit | Correct pathway rank | # of Core structures | Best Tanimoto similarity |
| --- | --- | --- | --- | --- | --- | --- |
| 4-Z-annimycin | BGC0001298 | 331.40 | False | 1/2 | 30 | 1.0 |
| Abyssomicin C | BGC0000001 | 346.40 | False | 1/6 | 16 | 1.0 |
| Chalcomycin A | BGC0000035 | 700.82 | False | 1/120 | 4 | 1.0 |
| Chlorothricin | BGC0000036 | 941.46 | False | 1/120 | 128 | 1.0 |
| Concanamycin | BGC0000040 | 692.90 | False | 1/720 | 16 | 1.0 |
| Halstoctacosanolide A | BGC0000073 | 845.12 | False | 1/5040 | 42 | 1.0 |
| Herboxidiene | BGC0001065 | 438.60 | False | 1/6 | 16 | 1.0 |
| Lasalocid | BGC0000086 | 612.80 | False | 1/5040 | 2 | 1.0 |
| Methymycin | BGC0000094 | 469.62 | False | 1/24 | 2 | 1.0 |
| Nystatin | BGC0000115 | 926.10 | False | 1/720 | 80 | 1.0 |
| Pimaricin | BGC0000125 | 665.70 | False | 4/120 | 32 | 1.0 |
| Pladienolide B | BGC0000126 | 536.70 | False | 1/24 | 15 | 1.0 |
| Salinomycin | BGC0000144 | 751.00 | False | 1/362,880 | 2 | 1.0 |
| Spinosad A | BGC0000148 | 731.97 | False | 1/120 | 15 | 1.0 |
| Streptoseomycin | BGC0001784 | 599.60 | False | 1/1 | 26 | 1.0 |
| Tetrocarcin A | BGC0000162 | 782.90 | False | 22/120 | 128 | 1.0 |
| Ansamitocin P-3 | BGC0000020 | 635.15 | True | 1/24 | 8 | 0.69 |
| Aureothin | BGC0000024 | 397.43 | True | 1/6 | 2 | 0.58 |
| Mycinamycin | BGC0000102 | 727.89 | True | 1/120 | 8 | 0.56 |
| Phenylnannolone A | BGC0000122 | 278.35 | True | 1/1 | 4 | 0.74 |
| Soraphen A | BGC0000147 | 520.70 | True | 1/2 | 2 | 0.46 |
| Spinosad A | BGC0000148 | 731.97 | True | 1/120 | 15 | 0.86 |
| Vicenistatin | BGC0000167 | 500.70 | True | 2/24 | 30 | 0.72 |

  1. For compounds with an uncommon starter unit, Seq2PKS recovers the compound structure except for the starter unit.
  2. The column "# of Core structures" shows how many theoretical core structures Seq2PKS generates. The compounds in the upper rows (Starter unit = False) do not have starter units, while those in the lower rows (Starter unit = True) do.
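
To illustrate why the rows with Starter unit = True report a best Tanimoto similarity below 1.0, the following is a minimal sketch of the Tanimoto coefficient computed over sets of fingerprint bits. It is not the Seq2PKS implementation: the function name `tanimoto` and the fingerprint bit sets are hypothetical, chosen only to show that a predicted structure missing one fragment (such as a starter unit) scores below 1.0 against the reference.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of fingerprint bits.

    Defined as |A & B| / |A | B|. Two empty fingerprints are treated as
    identical (an assumption made here to avoid dividing by zero).
    """
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical fingerprint bits, for illustration only.
predicted = {1, 4, 7, 9, 12}                      # predicted core structure
reference_same = {1, 4, 7, 9, 12}                 # reference with the same bits
reference_with_starter = {1, 4, 7, 9, 12, 20, 21} # reference with two extra bits

identical = tanimoto(predicted, reference_same)
partial = tanimoto(predicted, reference_with_starter)

print(f"identical structures: {identical:.2f}")
print(f"missing fragment:     {partial:.2f}")
```

In this toy case the shared bits are unchanged, but the extra bits contributed by the unmatched fragment enlarge the union, so the score falls from 1.0 to 5/7.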