Communications Biology
A multi-way SMILES-based hypergraph inference network for metabolic model reconstruction
  • Article
  • Open access
  • Published: 05 March 2026

  • Yanlong Zhao 1, na1
  • Yixiao Chen 2, na1 (ORCID: 0009-0009-1526-1629)
  • Yi Yu 3
  • Xiang Liu 3 (ORCID: 0009-0009-8978-8466)
  • Jiawen Du 4 (ORCID: 0000-0003-3711-8101)
  • Jun Wen 5,6
  • Quan Sun 7,8
  • Ren Wang 3 (ORCID: 0000-0002-6366-8898)
  • Can Chen 4,9,10,11 (ORCID: 0000-0003-2310-0074)

Communications Biology (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Machine learning
  • Metabolic engineering

Abstract

Genome-scale metabolic models (GEMs) are indispensable tools for probing cellular metabolism, enabling predictions of metabolic fluxes, guiding strain optimization, and advancing biomedical research. However, their predictive capacity is often compromised by incomplete reaction networks, stemming from gaps in biochemical knowledge, annotation inaccuracies, and insufficient experimental validation. Here we present MuSHIN (Multi-way SMILES-based Hypergraph Inference Network), a deep hypergraph learning method that integrates network topology with biochemical domain knowledge to predict missing reactions in GEMs. Evaluated on 926 high- and intermediate-quality GEMs with artificially removed reactions, MuSHIN achieves up to a 17% improvement over the current state-of-the-art method across multiple evaluation metrics. Furthermore, MuSHIN substantially enhances phenotypic predictions in 24 draft GEMs associated with fermentation by resolving critical metabolic gaps, as validated against experimental measurements. Together, these findings highlight MuSHIN’s potential to advance GEM reconstruction and accelerate discoveries in systems biology, metabolic engineering, and precision medicine.
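
As a rough illustration of the setup described in the abstract, the Python sketch below encodes a toy metabolic network as a hypergraph, where each reaction is a hyperedge over the metabolites it involves, and then hides a fraction of reactions to mimic the artificial-removal evaluation. It is not taken from the MuSHIN code base: the toy reaction set, the function names incidence_matrix and hold_out, and the 20% hold-out fraction are all assumptions chosen for illustration.

```python
import random

# Toy reaction set (hypothetical; metabolite IDs loosely follow BiGG-style naming).
# Each reaction is a hyperedge: the set of metabolites that take part in it.
reactions = {
    "HEX1": {"glc__D", "atp", "g6p", "adp", "h"},
    "PGI": {"g6p", "f6p"},
    "PFK": {"f6p", "atp", "fdp", "adp", "h"},
    "FBA": {"fdp", "dhap", "g3p"},
    "TPI": {"dhap", "g3p"},
}


def incidence_matrix(rxns):
    """Return (metabolites, reaction_ids, H), where H[i][j] is 1 if
    metabolite i participates in reaction j and 0 otherwise."""
    mets = sorted(set().union(*rxns.values()))
    rids = sorted(rxns)
    H = [[1 if m in rxns[r] else 0 for r in rids] for m in mets]
    return mets, rids, H


def hold_out(rxns, fraction, seed=0):
    """Split reactions into a kept set and an artificially removed set.
    The fraction is an illustrative assumption, not the paper's protocol."""
    rng = random.Random(seed)
    rids = sorted(rxns)
    n_removed = max(1, round(fraction * len(rids)))
    removed = set(rng.sample(rids, n_removed))
    kept = {r: m for r, m in rxns.items() if r not in removed}
    held = {r: m for r, m in rxns.items() if r in removed}
    return kept, held


mets, rids, H = incidence_matrix(reactions)
kept, held = hold_out(reactions, fraction=0.2)
```

A missing-reaction predictor would be trained on the kept hyperedges and scored on how well it ranks the held-out ones against negative candidates; that learning step is outside the scope of this sketch.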

Data availability

The datasets used and analyzed during the current study are included within this article and its Supplementary Information file. The raw data were collected from publicly available databases: ChEBI (https://www.ebi.ac.uk/chebi/), BiGG Models (http://bigg.ucsd.edu/), and AGORA Models (https://www.vmh.life). The source data for the figures are provided in the Supplementary Data file. More details can be found in Supplementary Note 5.

Code availability

The source code for our framework is available on GitHub (ref. 51): https://github.com/cyixiao/MuSHIN.

References

  1. Thiele, I., Price, N. D., Vo, T. D. & Palsson, B. Ø. Candidate metabolic network states in human mitochondria: Impact of diabetes, ischemia, and diet. J. Biol. Chem. 280, 11683–11695 (2005).

  2. Thiele, I., Jamshidi, N., Fleming, R. M. & Palsson, B. Ø. Genome-scale reconstruction of Escherichia coli’s transcriptional and translational machinery: a knowledge base, its mathematical formulation, and its functional characterization. PLoS Comput. Biol. 5, e1000312 (2009).

  3. Lee, S. Y. & Kim, H. U. Systems strategies for developing industrial microbial strains. Nat. Biotechnol. 33, 1061–1072 (2015).

  4. Gu, C., Kim, G. B., Kim, W. J., Kim, H. U. & Lee, S. Y. Current status and applications of genome-scale metabolic models. Genome Biol. 20, 121 (2019).

  5. Lieven, C. et al. MEMOTE for standardized genome-scale metabolic model testing. Nat. Biotechnol. 38, 272–276 (2020).

  6. Simeonidis, E. & Price, N. D. Genome-scale modeling for metabolic engineering. J. Ind. Microbiol. Biotechnol. 42, 327–338 (2015).

  7. Kim, B., Kim, W. J., Kim, D. I. & Lee, S. Y. Applications of genome-scale metabolic network model in metabolic engineering. J. Ind. Microbiol. Biotechnol. 42, 339–348 (2015).

  8. Raškevičius, V. et al. Genome scale metabolic models as tools for drug design and personalized medicine. PLoS One 13, e0190636 (2018).

  9. Robinson, J. L. & Nielsen, J. Anticancer drug discovery through genome-scale metabolic modeling. Curr. Opin. Syst. Biol. 4, 1–8 (2017).

  10. King, Z. A. et al. BiGG Models: a platform for integrating, standardizing and sharing genome-scale models. Nucleic Acids Res. 44, D515–D522 (2016).

  11. Nielsen, J. & Keasling, J. D. Engineering cellular metabolism. Cell 164, 1185–1197 (2016).

  12. Chen, C., Liao, C. & Liu, Y.-Y. Teasing out missing reactions in genome-scale metabolic networks through hypergraph learning. Nat. Commun. 14, 2375 (2023).

  13. Liu, X. et al. A generalizable framework for unlocking missing reactions in genome-scale metabolic networks using deep learning. arXiv preprint arXiv:2409.13259 (2024).

  14. Pan, S. & Reed, J. L. Advances in gap-filling genome-scale metabolic models and model-driven experiments lead to novel metabolic discoveries. Curr. Opin. Biotechnol. 51, 103–108 (2018).

  15. Benedict, M. N., Mundy, M. B., Henry, C. S., Chia, N. & Price, N. D. Likelihood-based gene annotations for gap filling and quality assessment in genome-scale metabolic models. PLoS Comput. Biol. 10, e1003882 (2014).

  16. Karp, P. D., Weaver, D. & Latendresse, M. How accurate is automated gap filling of metabolic models? BMC Syst. Biol. 12, 73 (2018).

  17. Orth, J. D. & Palsson, B. Ø. Systematizing the generation of missing metabolic knowledge. Biotechnol. Bioeng. 107, 403–412 (2010).

  18. Schroeder, W. L. & Saha, R. OptFill: a tool for infeasible cycle-free gapfilling of stoichiometric metabolic models. iScience 23, 100783 (2020).

  19. Prigent, S. et al. Meneco, a topology-based gap-filling tool applicable to degraded genome-wide metabolic networks. PLoS Comput. Biol. 13, e1005276 (2017).

  20. Satish Kumar, V., Dasika, M. S. & Maranas, C. D. Optimization based automated curation of metabolic reconstructions. BMC Bioinforma. 8, 212 (2007).

  21. Henry, C. S. et al. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat. Biotechnol. 28, 977–982 (2010).

  22. Almeida, A. et al. A new genomic blueprint of the human gut microbiota. Nature 568, 499–504 (2019).

  23. Machado, D., Andrejev, S., Tramontano, M. & Patil, K. R. Fast automated reconstruction of genome-scale metabolic models for microbial species and communities. Nucleic Acids Res. 46, 7542–7553 (2018).

  24. Chen, C. & Liu, Y.-Y. A survey on hyperlink prediction. IEEE Trans. Neural Netw. Learn. Syst. 35, 15034–15050 (2023).

  25. Feng, Y., You, H., Zhang, Z., Ji, R. & Gao, Y. Hypergraph neural networks. Proc. AAAI Conf. Artif. Intell. 33, 3558–3565 (2019).

  26. Bai, S., Zhang, F. & Torr, P. H. Hypergraph convolution and hypergraph attention. Pattern Recognit. 110, 107637 (2021).

  27. Chen, C., Surana, A., Bloch, A. M. & Rajapakse, I. Controllability of hypergraphs. IEEE Trans. Netw. Sci. Eng. 8, 1646–1657 (2021).

  28. Chen, C. & Rajapakse, I. Tensor entropy for uniform hypergraphs. IEEE Trans. Netw. Sci. Eng. 7, 2889–2900 (2020).

  29. Berge, C. Hypergraphs: Combinatorics of Finite Sets Vol. 45 (Elsevier, 1984).

  30. Zhou, D., Huang, J. & Schölkopf, B. Learning with hypergraphs: clustering, classification, and embedding. Adv. Neural Inform. Process. Syst. 19, 1601–1608 (2006).

  31. Gao, Y. et al. Hypergraph learning: methods and practices. IEEE Trans. Pattern Anal. Mach. Intell. 44, 2548–2566 (2020).

  32. Zhang, M., Cui, Z., Jiang, S. & Chen, Y. Beyond link prediction: Predicting hyperlinks in adjacency space. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 32 (2018).

  33. Sharma, G., Patil, P. & Murty, M. N. C3MM: clique-closure based hyperlink prediction. IJCAI 20, 3364–3370 (2020).

  34. Yadati, N. et al. NHP: Neural hypergraph link prediction. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 1705–1714 (2020).

  35. Chithrananda, S., Grand, G. & Ramsundar, B. ChemBERTa: large-scale self-supervised pretraining for molecular property prediction. arXiv preprint arXiv:2010.09885 (2020).

  36. Schwaller, P. et al. Mapping the space of chemical reactions using attention-based neural networks. Nat. Mach. Intell. 3, 144–152 (2021).

  37. Vaswani, A. et al. Attention is all you need. Adv. Neural Inform. Process. Syst. 30, 6000–6010 (2017).

  38. Magnúsdóttir, S. et al. Generation of genome-scale metabolic reconstructions for 773 members of the human gut microbiota. Nat. Biotechnol. 35, 81–89 (2017).

  39. Oyetunde, T., Zhang, M., Chen, Y., Tang, Y. & Lo, C. BoostGAPFILL: improving the fidelity of metabolic network reconstructions through integrated constraint and pattern-based methods. Bioinformatics 33, 608–611 (2017).

  40. Bernstein, D. B., Sulheim, S., Almaas, E. & Segrè, D. Addressing uncertainty in genome-scale metabolic model reconstruction and analysis. Genome Biol. 22, 64 (2021).

  41. Lü, W. et al. The formate channel FocA exports the products of mixed-acid fermentation. Proc. Natl. Acad. Sci. USA 109, 13254–13259 (2012).

  42. van ’t Hof, M. et al. High-quality genome-scale metabolic network reconstruction of probiotic bacterium Escherichia coli Nissle 1917. BMC Bioinforma. 23, 566 (2022).

  43. Bu, X. et al. Engineering endogenous ABC transporter with improving ATP supply and membrane flexibility enhances the secretion of β-carotene in Saccharomyces cerevisiae. Biotechnol. Biofuels 13, 168 (2020).

  44. Danchin, A. Zinc, an unexpected integrator of metabolism? Microb. Biotechnol. 13, 895–898 (2020).

  45. Hastings, J. et al. The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013. Nucleic Acids Res. 41, D456–D463 (2012).

  46. Mikolov, T., Chen, K., Corrado, G. & Dean, J. Efficient estimation of word representations in vector space. https://arxiv.org/abs/1301.3781 (2013).

  47. Gao, Y., Feng, Y., Ji, S. & Ji, R. HGNN+: general hypergraph neural networks. IEEE Trans. Pattern Anal. Mach. Intell. 45, 3181–3199 (2022).

  48. Jiang, J., Wei, Y., Feng, Y., Cao, J. & Gao, Y. Dynamic hypergraph neural networks. In IJCAI 2635–2641 (2019).

  49. Kim, S. et al. A survey on hypergraph neural networks: an in-depth and step-by-step guide. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 6534–6544 (2024).

  50. Yi, J. & Park, J. Hypergraph convolutional recurrent neural network. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 3366–3376 (2020).

  51. Chen, Y. & Zhao, Y. cyixiao/mushin: mushin (2026). Zenodo. Version v1.0. https://doi.org/10.5281/zenodo.18427362 (2026).

Acknowledgements

The authors would like to thank Dr. Chen Liao from Dartmouth College for his contributions to understanding MuSHIN’s metabolic gap-filling processes during the phenotypic prediction experiment.

Author information

Author notes
  1. These authors contributed equally: Yanlong Zhao, Yixiao Chen.

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY, USA

    Yanlong Zhao

  2. Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

    Yixiao Chen

  3. Department of Electrical and Computer Engineering, Illinois Institute of Technology, Chicago, IL, USA

    Yi Yu, Xiang Liu & Ren Wang

  4. Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

    Jiawen Du & Can Chen

  5. Harvard Medical School, Harvard University, Boston, MA, USA

    Jun Wen

  6. Department of Computational Biology, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE

    Jun Wen

  7. Center for Computational and Genomic Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA, USA

    Quan Sun

  8. Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA

    Quan Sun

  9. School of Data Science and Society, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

    Can Chen

  10. Department of Mathematics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

    Can Chen

  11. Carolina Health Informatics Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

    Can Chen

Contributions

C.C. conceived and designed the project. Y.Z., Y.C., and X.L. developed the MuSHIN algorithm. Y.C. performed the internal validation. Y.Z. performed the external validation. Y.Z. and Y.C. interpreted the results. Y.C., Y.Z., Y.Y., and J.D. prepared the manuscript. J.W., Q.S., R.W., and C.C. edited and approved the manuscript.

Corresponding authors

Correspondence to Ren Wang or Can Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Ove Øyås and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Silvio Waschina and Laura Rodríguez. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Transparent Peer Review file (PDF)

Supplementary Information (PDF)

Description of Additional Supplementary files (PDF)

Supplementary Data (ZIP)

Reporting Summary (PDF)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Zhao, Y., Chen, Y., Yu, Y. et al. A multi-way SMILES-based hypergraph inference network for metabolic model reconstruction. Commun Biol (2026). https://doi.org/10.1038/s42003-026-09761-1

  • Received: 10 July 2025

  • Accepted: 17 February 2026

  • Published: 05 March 2026

  • DOI: https://doi.org/10.1038/s42003-026-09761-1

Communications Biology (Commun Biol)

ISSN 2399-3642 (online)
