Fig. 7: BioPKS pipeline successfully suggests integrated biosynthetic pathways to 93 biomanufacturing candidates.

a When the BioPKS pipeline is prompted to synthesize the validation set of 155 potential biomanufacturing candidates, it achieves an overall hit rate of 60% (93/155). Of the 93 molecules that can be synthesized exactly, 3 are produced by PKSs alone and 46 more are reached with a single post-PKS tailoring step applied to the PKS product. A second post-PKS step yields a further 44 molecules, bringing the total synthesized by the BioPKS pipeline to 93. b Although the remaining 62 of the 155 molecules cannot be synthesized exactly by the BioPKS pipeline, post-PKS modification steps bring the final product chemically closer to the target than the PKS product alone. The box plots within the violin plots show the distribution (n = 62) of chemical similarity scores for the final product that is most chemically similar to each target. The white bar in each box plot marks the median chemical similarity; the lower and upper bounds of each box mark the 25th and 75th percentiles, respectively; and the lower and upper whiskers mark the minimum and maximum, respectively. Across all of these statistics, the chemical similarity between the final product and its corresponding target increases with each post-PKS modification. Source data are provided as a Source Data file.
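The counts reported in panel a can be checked with a short tally. The sketch below is illustrative only and is not part of the BioPKS pipeline: the variable names (`TOTAL_TARGETS`, `new_hits`, `cumulative`) are assumptions introduced here, and the numbers are taken directly from the caption.

```python
from itertools import accumulate

# Size of the validation set, as stated in the caption.
TOTAL_TARGETS = 155

# Newly reached targets at each stage, as reported for panel a.
# The dictionary keys are hypothetical labels chosen for this sketch.
new_hits = {
    "PKS only": 3,
    "1 post-PKS step": 46,
    "2 post-PKS steps": 44,
}

# Running total of targets reached after each stage.
cumulative = dict(zip(new_hits, accumulate(new_hits.values())))

total_hits = sum(new_hits.values())
hit_rate = total_hits / TOTAL_TARGETS
unreached = TOTAL_TARGETS - total_hits  # the n used in panel b

for stage, count in cumulative.items():
    print(f"{stage}: {count} targets reached so far")
print(f"hit rate: {hit_rate:.0%}, unreached targets: {unreached}")
```

The running totals are what make the three stage counts consistent with the headline figure: the per-stage values sum to the 93 exact hits, and the remainder equals the n = 62 used for the similarity distributions in panel b.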