Abstract
Small open reading frames (smORFs), which encode proteins of fewer than 100 amino acids, represent an underexplored dimension of the human gut microbiome despite growing evidence of their essential biological roles. Because of their small size and poor annotation, smORFs are typically excluded from metagenomic and metaproteomic analyses. Here, we present a high-resolution multi-omics workflow that integrates smORF prediction into metaproteome searches and, using state-of-the-art mass spectrometry instrumentation, enables ultra-deep detection of smORF-encoded proteins (SEPs) without experimental size-based enrichment. Applied to human gut microbiomes, this approach yielded the largest number of detected SEPs to date, identifying over 25,000 SEPs in the metaproteome alongside measurements of larger proteins. This integrative multi-omics strategy is critical for advancing human metaproteome research and provides a generalizable approach for comprehensive SEP discovery across diverse microbial ecosystems, greatly expanding the previously hidden proteomic landscape.
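The size criterion stated in the abstract (proteins under 100 amino acids) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes predicted protein sequences are already available as FASTA text, and the names `parse_fasta`, `split_by_length`, and `SEP_MAX_LENGTH` are hypothetical helpers introduced only for illustration. It simply partitions predicted proteins into SEP candidates and larger proteins, so that both sets could be kept in one search database.

```python
from typing import Dict, Iterator, Tuple

# Length cutoff taken from the abstract's definition ("under 100 amino acids").
SEP_MAX_LENGTH = 100


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(
    text: str, max_len: int = SEP_MAX_LENGTH
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split predicted proteins into SEP candidates (< max_len aa) and larger proteins."""
    seps: Dict[str, str] = {}
    larger: Dict[str, str] = {}
    for header, seq in parse_fasta(text):
        seq = seq.rstrip("*")  # drop a trailing stop-codon symbol, if present
        target = seps if len(seq) < max_len else larger
        target[header] = seq
    return seps, larger
```

Calling `split_by_length(fasta_text)` returns two dictionaries keyed by FASTA header. A sequence of exactly 100 residues falls into the larger-protein set, because the cutoff is applied as strictly "under 100".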
Acknowledgements
We thank the staff of the Luxembourg Center for Systems Biomedicine (LCSB), particularly the sequencing platform, for running the sequencing analysis. The bioinformatics analyses presented in this paper were carried out using the HPC facilities of the University of Luxembourg (ref. 68). We also thank Richard J. Giannone from the bioanalytical mass spectrometry group at Oak Ridge National Laboratory for technical insight and internal review of this project. Senior authors P.W. and R.L.H. disclose support for the research of this work to all co-authors from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program [grant agreement No. 863664] and P.W. further acknowledges support from the Luxembourg National Research Fund [grant number C23/BM/18091896/Infectome]. Co-first authors M.E.D. and J.O.S. disclose individual support, with M.E.D. supported by the National Science Foundation Graduate Research Fellowship Program and J.O.S. supported by a Pélican Grant from the Fondation de Luxembourg. P.W. further discloses individual support from a Fulbright Research Scholarship from the Commission for Educational Exchange between the United States, Belgium, and Luxembourg.
Ethics declarations
Competing interests
All authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Davin, M.E., Ortís Sunyer, J., Delgado, L.F. et al. High-resolution multi-omics enhances prediction and detection of smORF-encoded proteins in the human gut microbiome. Nat Commun (2026). https://doi.org/10.1038/s41467-026-72762-5
DOI: https://doi.org/10.1038/s41467-026-72762-5


