Figure 3
From: Interpretable genotype-to-phenotype classifiers with performance guarantees

Rule-based genotype-to-phenotype classifiers. Each classifier is shown as a hierarchical arrangement of rules (boxes) and leaves (rings). Predictions are made by placing a genome at the root and branching left or right based on the outcome of the rules until a leaf is reached. A leaf predicts the most abundant class among the training examples that were guided into it. The number of such examples is shown in its center and the ring is colored according to the distribution of their phenotypes (classes). Each rule detects the presence/absence of a k-mer and was colored according to the genomic locus at which it was found. Additionally, each box contains a measure of rule importance and the number of rules that were found to be equivalent during training. Finally, the logical expression leading to the prediction of the resistant phenotype is shown, under each model, to contrast the structure of models learned by each algorithm. In these expressions, the rules are shown as squares with colors that match those of the models and the negation of a rule is used to indicate the “absence“ of the k-mer. In both cases, the CART models are disjunctions of conjunctions (ORs of ANDs) and the SCM models are disjunctions (ORs).