Figure 1
From: Epigraph: A Vaccine Design Tool Applied to an HIV Therapeutic Vaccine and a Pan-Filovirus Vaccine

(a) Full graph for the CRF01-AE clade of the Nef protein. The green rectangle marks the inset shown in (b). Nodes are red dots, one for each k-mer variant (potential epitope), with k = 9. The edges are thin blue lines that connect epitopes whose sequences overlap by k − 1 amino acids, as shown for the first two epitopes (e_a = VTSSNMNNA, e_b = TSSNMNNAD) in the upper left of (b). Although the topological properties of the graph do not depend on the node positions, this plot uses the vertical axis to indicate each node's epitope frequency in the target sequence set, y = f(e). The horizontal position of the nodes is chosen so that all directed edges run from left to right. The ideal path through this graph keeps as much as possible to the largest y-values; this path defines a protein sequence that maximizes epitope coverage of the population. (b) The inset shows two paths through the nodes. The solid black line is the optimal path and corresponds to the sequence VTSSNMNNADSVWLRAQEEEE, while the dashed green line corresponds to VTSSNMNNADCVWLRAQEEEE. The two sequences differ at a single residue, so the paths differ only at the nine nodes whose 9-mers contain that residue: the dashed line achieves higher f(e) on 4 of these nodes, but the solid line has higher f(e) on the other 5, and its total ∑ f(e) is higher. Note that there is no path that includes the highest-valued node at every horizontal position.
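The construction described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the Epigraph tool itself: the function names are hypothetical, the graph is assumed to be acyclic, and the toy population (three copies of the solid-line sequence and two of the dashed-line sequence) is invented purely for demonstration, so its frequencies are not those plotted in the figure.

```python
from collections import Counter
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def kmer_frequencies(seqs, k):
    """f(e): fraction of sequences containing k-mer e at least once."""
    counts = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return {e: c / len(seqs) for e, c in counts.items()}


def best_path_sequence(freq):
    """Return (sequence, sum of f(e)) for the highest-scoring path.

    Assumes the k-mer graph is acyclic; TopologicalSorter raises
    CycleError otherwise.
    """
    # Edge e -> n when the last k-1 residues of e equal the first k-1 of n.
    succ = {e: [n for n in freq if e[1:] == n[:-1]] for e in freq}
    score, nxt = {}, {}
    # Passing successors as "dependencies" makes static_order() yield
    # every node only after all of its successors have been yielded.
    for e in TopologicalSorter(succ).static_order():
        best = max(succ[e], key=score.__getitem__, default=None)
        score[e] = freq[e] + (score[best] if best is not None else 0.0)
        nxt[e] = best
    # Start from the node with the best total, then follow the pointers,
    # appending one residue per step.
    node = max(score, key=score.__getitem__)
    total, seq = score[node], node
    while nxt[node] is not None:
        node = nxt[node]
        seq += node[-1]
    return seq, total


# Hypothetical toy population built from the two sequences in panel (b).
solid = "VTSSNMNNADSVWLRAQEEEE"
dashed = "VTSSNMNNADCVWLRAQEEEE"
population = [solid] * 3 + [dashed] * 2

freq = kmer_frequencies(population, k=9)
path, total = best_path_sequence(freq)
print(path, total)
```

With this invented 3:2 split the four shared 9-mers have f(e) = 1.0, the nine 9-mers specific to the solid sequence have f(e) = 0.6, and the nine specific to the dashed sequence have f(e) = 0.4, so the sketch selects the solid-line sequence. In the real data the two paths trade off node by node, which is why the comparison has to be made on ∑ f(e) rather than on any single node.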