Fig. 1: Development of rhizoSMASH. | Nature Communications

From: Predicting rhizosphere-competence-related catabolic gene clusters in plant-associated bacteria with rhizoSMASH

a The gene cluster prediction workflow of rhizoSMASH. RhizoSMASH takes a genome sequence file (GenBank or FASTA) as input and recognizes potential catabolic enzymes by scanning the sequence with profile hidden Markov models. Gene clusters encoding relevant pathways are then detected using a set of detection rules. b The tuning procedure used to curate the rCGC detection rules. An initial set of detection rules was first compiled from a comprehensive literature study. Genome sequences in our BARS collection were then scanned using this set of detection rules. The output gene clusters were grouped into cluster families with BiG-SCAPE, together with our database of known clusters, rKnownCGCs. We manually curated the detection rules by visually inspecting the gene cluster family network generated by BiG-SCAPE for putative false positives and false negatives, aided by further literature searches when needed. This cycle of calibration, validation and fine-tuning was performed three times, refining the detection rules with each round. Created with BioRender.com.
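
The rule-based detection step in panel a can be illustrated with a minimal sketch. This is not rhizoSMASH's actual rule format or code: the `Hit` and `Rule` types, the profile names "A" and "B", and the `max_gap` parameter are all hypothetical, and the profile HMM hits are assumed to have been computed already by a separate search step. The sketch groups hits that lie close together on the genome and keeps only the groups that contain every profile a rule requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    gene: str      # gene identifier
    profile: str   # name of the matching profile HMM (hypothetical)
    start: int     # gene start coordinate in bp
    end: int       # gene end coordinate in bp


@dataclass(frozen=True)
class Rule:
    name: str
    required: frozenset  # profiles that must all occur in one cluster
    max_gap: int         # largest allowed distance in bp between grouped hits


def detect_clusters(hits, rule):
    """Return groups of neighbouring hits that satisfy the rule."""
    # Only hits to profiles named in the rule are considered.
    relevant = sorted(
        (h for h in hits if h.profile in rule.required),
        key=lambda h: h.start,
    )
    groups, current, reach = [], [], 0
    for h in relevant:
        # Start a new group when the gap to the previous group is too large.
        if current and h.start - reach > rule.max_gap:
            groups.append(current)
            current = []
        current.append(h)
        reach = h.end if len(current) == 1 else max(reach, h.end)
    if current:
        groups.append(current)
    # Keep only groups that contain every required profile.
    return [g for g in groups if rule.required <= {h.profile for h in g}]


# Toy input: two adjacent genes form a cluster, a distant third does not.
hits = [
    Hit("g1", "A", 100, 900),
    Hit("g2", "B", 1000, 1800),
    Hit("g3", "A", 50000, 51000),
]
rule = Rule("demo_pathway", frozenset({"A", "B"}), max_gap=5000)
found = detect_clusters(hits, rule)
```

In this toy input, `g1` and `g2` are grouped because they are 100 bp apart and together cover both required profiles, while `g3` is too far away and lacks a partner hit. Real detection rules are likely to be richer than a set of required profiles, so the sketch only conveys the general idea.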
