Table 2 Benchmarking results of de novo and genome-guided transcriptome assembly for plant RiPP precursor discovery

From: Large-scale transcriptome mining enables macrocyclic diversification and improved bioactivity of the stephanotic acid scaffold

| Assembler | Approach | Total time [min] | Total memory usage [GB] | Total detected cores (98 total cores) | Detected unique correct cores (38 total unique cores) | Detected repetitive cores (87 total repetitive cores) | Similarity [sum %] (total similarity score: 2000) | Identity [sum %] (total identity score: 2000) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MEGAHIT (v1.2.9) | De novo | 940 ± 44 | 123 ± 1 | 42 ± 0 | 24 ± 0 | 32 ± 0 | 1582 ± 0 | 1587 ± 0 |
| SPAdes (v3.15.5) | De novo | 1185 ± 39 | 180 ± 1 | 66 ± 0 | 32 ± 0 | 56 ± 0 | 1728 ± 0 | 1713 ± 0 |
| Trinity (v2.15.1) | De novo | 13683 ± 621 | 485 ± 6 | 41 ± 0 | 25 ± 0 | 30 ± 1 | 1527 ± 2 | 1521 ± 2 |
| StringTie (v2.2.1) | Genome-guided | 922 ± 621 | 1635 ± 2 | 77 ± 0 | 27 ± 0 | 65 ± 0 | 1565 ± 0 | 1545 ± 0 |
| Trinity (v2.15.1) | Genome-guided | 10545 ± 417 | 1787 ± 4 | 62 ± 0 | 28 ± 0 | 51 ± 0 | 1795 ± 1 | 1691 ± 1 |

  1. Sixteen RNA-seq datasets with corresponding annotated genomes, covering 20 precursor genes with 98 core peptides (38 of them unique), were trimmed and assembled. Each RNA-seq dataset was assembled in triplicate (n = 3). Displayed values are means ± one standard deviation. Source data are provided as a Source Data file.
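
The detection counts and summed scores in Table 2 are easier to compare across assemblers when expressed as a fraction of the totals given in the column headers. The following is a minimal sketch of that arithmetic, not part of the original analysis: the mean values are copied from the table, while the names `RESULTS`, `TOTALS`, and `recovery_percentages` are hypothetical and introduced only for illustration.

```python
# Totals stated in the column headers of Table 2.
TOTALS = {
    "cores": 98,             # total core peptides
    "unique_cores": 38,      # total unique core peptides
    "repetitive_cores": 87,  # total repetitive core peptides
    "similarity": 2000,      # maximum summed similarity score
    "identity": 2000,        # maximum summed identity score
}

# Mean values copied from Table 2 (standard deviations omitted).
# Keys are (assembler, approach); the dict layout is an illustrative assumption.
RESULTS = {
    ("MEGAHIT", "De novo"): {
        "cores": 42, "unique_cores": 24, "repetitive_cores": 32,
        "similarity": 1582, "identity": 1587,
    },
    ("SPAdes", "De novo"): {
        "cores": 66, "unique_cores": 32, "repetitive_cores": 56,
        "similarity": 1728, "identity": 1713,
    },
    ("Trinity", "De novo"): {
        "cores": 41, "unique_cores": 25, "repetitive_cores": 30,
        "similarity": 1527, "identity": 1521,
    },
    ("StringTie", "Genome-guided"): {
        "cores": 77, "unique_cores": 27, "repetitive_cores": 65,
        "similarity": 1565, "identity": 1545,
    },
    ("Trinity", "Genome-guided"): {
        "cores": 62, "unique_cores": 28, "repetitive_cores": 51,
        "similarity": 1795, "identity": 1691,
    },
}


def recovery_percentages(row):
    """Express each mean value as a percentage of its column total."""
    return {metric: 100.0 * value / TOTALS[metric] for metric, value in row.items()}


if __name__ == "__main__":
    for (assembler, approach), row in RESULTS.items():
        pct = recovery_percentages(row)
        summary = ", ".join(f"{metric}: {value:.1f}%" for metric, value in pct.items())
        print(f"{assembler} ({approach}) -> {summary}")
```

Read this way, SPAdes recovers 32 of the 38 unique cores (about 84%), the highest unique-core count in the table, while StringTie detects the most cores overall (77 of 98).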