Nature Communications
Combining structural modeling and deep learning to calculate the E. coli protein interactome and functional networks
  • Article
  • Open access
  • Published: 11 April 2026

  • H. Zhao 1,2,3 (ORCID: orcid.org/0000-0003-1168-5730)
  • C. Velez 1
  • A. Naravane 1
  • A. Saha 1 (ORCID: orcid.org/0000-0003-0776-9771)
  • J. Feldman 4
  • J. Skolnick 5
  • D. Murray 1 (ORCID: orcid.org/0000-0003-4121-1536)
  • B. Honig 1,6,7,8 (ORCID: orcid.org/0000-0002-1835-1031)

Nature Communications (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Atomic force microscopy
  • Computational biophysics
  • Proteomics

Abstract

We report on the integration of three methods that predict, on a proteome-wide scale, whether two proteins are likely to form a binary complex. The methods include PrePPI, which uses three-dimensional structure information as a basis for predictions, Topsy-Turvy, which uses a protein language model, and ZEPPI, which uses evolutionary information to evaluate protein-protein interfaces. Testing on the high-quality HINT database of binary PPIs reveals that the integrated method has better performance and identifies more high-confidence interactions than any of the component methods. The AF3Complex algorithm is used to predict the structures of 374 PPIs with a large fraction having at least partially overlapping interfaces with PrePPI models of the same complex. Clustering of the high-confidence E. coli interactome yields 385 subnetworks which have high functional coherence. Biological insights derived from the subnetworks, including the annotation of proteins of unknown function, are discussed in detail.
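
The notion of "at least partially overlapping interfaces" between two models of the same complex can be illustrated with a small sketch. It assumes each model's interface is summarized as a set of (chain, residue number) pairs and measures overlap with a Jaccard index. The function name, the residue sets, and the choice of Jaccard are illustrative assumptions, not the criterion used in the study.

```python
def interface_overlap(model_a, model_b):
    """Jaccard overlap between two sets of interface residues.

    Assumption: each interface is a collection of (chain, residue_number)
    tuples; the study's own overlap definition may differ.
    """
    a, b = set(model_a), set(model_b)
    if not a and not b:
        return 0.0  # two empty interfaces share nothing
    return len(a & b) / len(a | b)


# Hypothetical interface residues for one complex (not real data).
preppi_iface = {("A", 10), ("A", 11), ("A", 42), ("B", 7), ("B", 8)}
af3_iface = {("A", 11), ("A", 42), ("A", 43), ("B", 8), ("B", 9)}

overlap = interface_overlap(preppi_iface, af3_iface)
partially_overlapping = overlap > 0.0  # any shared residue counts here
print(f"Jaccard overlap: {overlap:.3f}, partial overlap: {partially_overlapping}")
```

Here three of the seven distinct residues are shared, so the two hypothetical interfaces count as partially overlapping under this simple definition.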

Data availability

All predictions generated in this study, including genome-wide PPI predictions for human and E. coli from three methods (PrePPI, ZEPPI, and D-Script-TT) and the integrated predictions derived from the Bayesian model, have been uploaded to Figshare [https://doi.org/10.6084/m9.figshare.31362145]. The PrePPI predictions can also be downloaded from the PrePPI website [https://honigcomplab.c2b2.columbia.edu/PrePPI]. Supplementary Tables S1 and S2 are available. The source data underlying Supplementary Fig. S1 are provided as a Source Data file in the GitHub repository [https://github.com/honig-lab/BayesianModel-for-Ecoli-PPI/tree/main/data].

Code availability

The code and tutorial for integrating PrePPI, ZEPPI, and D-Script-TT inputs through a Bayesian framework are available on the GitHub repository [https://github.com/honig-lab/BayesianModel-for-Ecoli-PPI] and from Zenodo [https://doi.org/10.5281/zenodo.18684873].
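
As a rough illustration of what combining several predictors through a Bayesian framework can look like, the sketch below multiplies per-method likelihood ratios with prior odds under a naive conditional-independence assumption. This is a generic naive Bayes sketch, not the code in the repository above; the function name, the likelihood-ratio values, and the prior odds are all hypothetical.

```python
import math


def combine_likelihood_ratios(lrs, prior_odds):
    """Posterior probability from prior odds and per-method likelihood ratios.

    Assumption: the methods are conditionally independent, so
    posterior odds = prior odds * product of the likelihood ratios.
    """
    # Sum in log space to avoid overflow when many ratios are multiplied.
    log_odds = math.log(prior_odds) + sum(math.log(lr) for lr in lrs)
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


# Hypothetical likelihood ratios for a single protein pair (not real data).
lrs = {"PrePPI": 50.0, "ZEPPI": 4.0, "D-Script-TT": 5.0}
prior_odds = 1.0 / 1000.0  # hypothetical prior odds that a random pair interacts

posterior = combine_likelihood_ratios(lrs.values(), prior_odds)
print(f"Posterior probability of interaction: {posterior:.3f}")
```

With these made-up numbers the combined likelihood ratio (1000) exactly offsets the prior odds, giving a posterior probability of about 0.5. No single method alone would reach that level, which is the general motivation for integrating complementary evidence.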

Acknowledgments

This research was supported in part by grants R35-GM139585 (BH) and R35-GM118039 (JS) from the Division of General Medical Sciences of the National Institutes of Health. HZ acknowledges support from UTMB and the UT System Rising STARs Award. We thank Drs. Samuel Sledzieski and Rohit Singh for technical help in the installation of the D-SCRIPT and Topsy-Turvy programs, and Professors Lenore Cowen and Bonnie Berger for helpful discussions at early stages of the project.

Author information

Authors and Affiliations

  1. Department of Systems Biology, Columbia University Irving Medical Center, 1130 St Nicholas Ave, New York, NY, USA

    H. Zhao, C. Velez, A. Naravane, A. Saha, D. Murray & B. Honig

  2. Department of Biochemistry & Molecular Biology, The University of Texas Medical Branch, 301 University Boulevard, Galveston, TX, USA

    H. Zhao

  3. Sealy Center for Structural Biology and Molecular Biophysics, The University of Texas Medical Branch, 301 University Boulevard, Galveston, TX, USA

    H. Zhao

  4. School of Computer Science, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA, USA

    J. Feldman

  5. Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, N.W., Atlanta, GA, USA

    J. Skolnick

  6. Department of Biochemistry & Molecular Biophysics, Columbia University Irving Medical Center, 701 W 168th Street, New York, NY, USA

    B. Honig

  7. Department of Medicine, Columbia University Irving Medical Center, 630 W 168th Street, New York, NY, USA

    B. Honig

  8. Zuckerman Mind Brain and Behavior Institute, Columbia University, 3227 Broadway, New York, NY, USA

    B. Honig

Contributions

B.H., H.Z., and D.M. designed research, analyzed results, and wrote the manuscript. H.Z., D.M., C.V., and A.N. performed research. A.S. contributed software tools. J.F. and J.S. contributed to the prediction of AF3Complex models and, with C.V., their analysis.

Corresponding authors

Correspondence to J. Skolnick, D. Murray or B. Honig.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Victor Reys, Ilya Vakser, and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (PDF)

Description of Additional Supplementary Files (PDF)

Supplementary Data 1 (XLSX)

Supplementary Data 2 (XLSX)

Reporting Summary (PDF)

Transparent Peer Review file (PDF)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhao, H., Velez, C., Naravane, A. et al. Combining structural modeling and deep learning to calculate the E. coli protein interactome and functional networks. Nat Commun (2026). https://doi.org/10.1038/s41467-026-71166-9

  • Received: 17 June 2025

  • Accepted: 13 March 2026

  • Published: 11 April 2026

  • DOI: https://doi.org/10.1038/s41467-026-71166-9

Nature Communications (Nat Commun)

ISSN 2041-1723 (online)

© 2026 Springer Nature Limited
