Abstract
We report on the integration of three methods that predict, on a proteome-wide scale, whether two proteins are likely to form a binary complex. The methods are PrePPI, which bases its predictions on three-dimensional structure information; Topsy-Turvy, which uses a protein language model; and ZEPPI, which uses evolutionary information to evaluate protein-protein interfaces. Testing on the high-quality HINT database of binary protein-protein interactions (PPIs) shows that the integrated method performs better and identifies more high-confidence interactions than any of its component methods. The AF3Complex algorithm is used to predict the structures of 374 PPIs, a large fraction of which have interfaces that at least partially overlap those of the PrePPI models of the same complex. Clustering the high-confidence E. coli interactome yields 385 subnetworks with high functional coherence. Biological insights derived from the subnetworks, including the annotation of proteins of unknown function, are discussed in detail.
Data availability
All predictions generated in this study, including genome-wide PPI predictions for human and E. coli from three different methods (PrePPI, ZEPPI, and D-Script-TT) and the integrated predictions derived from the Bayesian model, have been uploaded to Figshare [https://doi.org/10.6084/m9.figshare.31362145]. The PrePPI predictions can also be downloaded from the PrePPI website [https://honigcomplab.c2b2.columbia.edu/PrePPI]. Supplementary Tables S1 and S2 are available. The source data underlying Supplementary Fig. S1 are provided as a Source Data file in the GitHub repository [https://github.com/honig-lab/BayesianModel-for-Ecoli-PPI/tree/main/data].
Code availability
The code and a tutorial for integrating PrePPI, ZEPPI, and D-Script-TT inputs through a Bayesian framework are available in the GitHub repository [https://github.com/honig-lab/BayesianModel-for-Ecoli-PPI] and from Zenodo [https://doi.org/10.5281/zenodo.18684873].
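As a rough illustration of what combining several predictors through a Bayesian framework can look like, the sketch below uses a naive Bayes scheme: each method's score is binned, a likelihood ratio (LR) is estimated per bin from labeled interacting and non-interacting pairs, and the per-method LRs are multiplied under an assumption of conditional independence. This is not the released code. The function names, the binning, the pseudocount, and the independence assumption are all hypothetical choices made for this example, and the repository above should be consulted for the actual model.

```python
import math
from bisect import bisect_right


def bin_likelihood_ratios(pos_scores, neg_scores, edges, pseudocount=1.0):
    """Estimate one likelihood ratio per score bin for a single method.

    LR(bin) = P(score in bin | interacting) / P(score in bin | non-interacting).
    `edges` are sorted bin boundaries; a score equal to an edge falls in the
    upper bin. The pseudocount (an assumption) avoids division by zero.
    """
    n_bins = len(edges) + 1
    pos_counts = [pseudocount] * n_bins
    neg_counts = [pseudocount] * n_bins
    for s in pos_scores:
        pos_counts[bisect_right(edges, s)] += 1
    for s in neg_scores:
        neg_counts[bisect_right(edges, s)] += 1
    pos_total = sum(pos_counts)
    neg_total = sum(neg_counts)
    return [
        (p / pos_total) / (n / neg_total)
        for p, n in zip(pos_counts, neg_counts)
    ]


def combined_lr(scores, edges_by_method, lrs_by_method):
    """Multiply per-method LRs (summed in log space) for one protein pair.

    `scores` maps a method name to that method's score for the pair.
    Treating the methods as conditionally independent is an assumption.
    """
    log_lr = 0.0
    for method, score in scores.items():
        b = bisect_right(edges_by_method[method], score)
        log_lr += math.log(lrs_by_method[method][b])
    return math.exp(log_lr)


if __name__ == "__main__":
    # Toy scores, not real data: two bins split at 0.5.
    edges = [0.5]
    lrs = bin_likelihood_ratios([0.9, 0.8, 0.7, 0.2], [0.1, 0.2, 0.3, 0.8], edges)
    pair = {"method_a": 0.9, "method_b": 0.9}
    total = combined_lr(
        pair,
        {"method_a": edges, "method_b": edges},
        {"method_a": lrs, "method_b": lrs},
    )
    print(lrs, total)
```

Working in log space keeps the product stable when many methods or very small LRs are involved. A pair would then be called high-confidence when the combined LR exceeds a chosen cutoff; the cutoff is a further modeling decision not shown here.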
Acknowledgments
This research was supported in part by grants R35-GM139585 (BH) and R35-GM118039 (JS) from the Division of General Medical Sciences of the National Institutes of Health. HZ acknowledges support from UTMB and the UT System Rising STARs Award. We thank Drs. Samuel Sledzieski and Rohit Singh for technical help in the installation of the D-SCRIPT and Topsy-Turvy programs, and Professors Lenore Cowen and Bonnie Berger for helpful discussions at early stages of the project.
Author information
Contributions
B.H., H.Z., and D.M. designed research, analyzed results, and wrote the manuscript. H.Z., D.M., C.V., and A.N. performed research. A.S. contributed software tools. J.F. and J.S. contributed to the prediction of AF3Complex models and, with C.V., to their analysis.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Victor Reys, Ilya Vakser, and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhao, H., Velez, C., Naravane, A. et al. Combining structural modeling and deep learning to calculate the E. coli protein interactome and functional networks. Nat Commun (2026). https://doi.org/10.1038/s41467-026-71166-9