Fig. 1: Scaling EVcouplings methods to full bacterial genomes.

A The search problem for binary protein-protein interactions in Escherichia coli involves finding all of the estimated 10⁴ true interactions out of 10⁷ possible pairs. Only approximately 9% of these true direct interactions have a crystal structure solved in E. coli or a homologous structure in another organism. B Evolutionary couplings learned directly from protein sequences can resolve interfaces. Sequence alignments of both monomeric proteins are created and concatenated by the reciprocal highest identity procedure before inference of evolutionary couplings. Raw evolutionary coupling scores can be combined with features of their distribution, biochemical properties, and sequence entropy to improve inference. C A benchmark dataset of all non-redundant protein interactions with known interface structure.