Abstract
Using biomedical foundation models (FMs) for inference on small cohorts represents a promising and practical route to advance drug response biomarker discovery and target identification. Here, we demonstrate this via an innovative data-driven inference workflow that uses a fine-tuned biomedical FM. We study multi-omics (genomic and transcriptomic) data and predict pharmacological responses, with both the omics data and the response measurements obtained from surgical diseased tissue of inflammatory bowel disease (IBD) patients. We use FM inference to inform feature selection and feature engineering strategies, where FM-derived features provide an advantage for predicting IBD patient drug response and for target identification. The workflow has three parts. First, we calculate drug-target binding affinity (BA), which enables prioritisation of protein/gene targets and associated SNPs for drugs of interest. Second, we use patient SNPs to mutate reference proteins and assess the impact on drug BA. Third, we build strategies to fuse BAs and transcriptomics. Additionally, we created an open-source Model Context Protocol (MCP) server, making our FM inference example accessible to the community via AI agents and natural language prompts.
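To make the second step concrete, the following is a minimal sketch of how patient-specific missense variants could be applied to a reference protein sequence before the mutated sequence is passed to a binding affinity model. It is an illustration under stated assumptions, not the authors' pipeline: the function names `apply_missense` and `mutate_protein`, the single-letter `A4V`-style change notation, and the toy sequence are all hypothetical, and only single amino acid substitutions are handled.

```python
import re

# Hypothetical notation: "<ref_aa><1-based position><alt_aa>", e.g. "A4V".
_CHANGE_PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_missense(reference: str, change: str) -> str:
    """Return a copy of `reference` with one amino acid substitution applied."""
    match = _CHANGE_PATTERN.fullmatch(change)
    if match is None:
        raise ValueError(f"Unsupported change notation: {change!r}")
    ref_aa, position, alt_aa = match.group(1), int(match.group(2)), match.group(3)
    if position < 1 or position > len(reference):
        raise ValueError(f"Position {position} is outside the sequence")
    # Guard against annotating a variant onto the wrong protein or isoform.
    if reference[position - 1] != ref_aa:
        raise ValueError(
            f"Expected {ref_aa} at position {position}, "
            f"found {reference[position - 1]}"
        )
    return reference[: position - 1] + alt_aa + reference[position:]


def mutate_protein(reference: str, changes: list[str]) -> str:
    """Apply several substitutions in turn; substitutions never shift positions."""
    mutated = reference
    for change in changes:
        mutated = apply_missense(mutated, change)
    return mutated


if __name__ == "__main__":
    toy_reference = "MKTAYIAK"  # toy sequence, not a real protein
    print(mutate_protein(toy_reference, ["K2R", "A4V"]))
```

In a workflow of this kind, the reference and mutated sequences would each be scored against the same drug, and the difference in predicted BA would become a per-patient feature. Insertions, deletions and frameshifts would need separate handling.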
Data availability
The dataset that was generated during the current study and used to train our best ML model is available in Supplementary File 3 (calculated binding affinities, TMM values and interaction states; 1230 features for the 51 patients). The sequences used with our example AI agent workflow are available in Supplementary Files 1 and 2 (FASTA sequences provided as txt files). Where consent has been provided, the raw RNA datasets analysed during the current study are available in the NCBI-SRA repository under project ID PRJEB43220. Further data and metadata are available through a controlled access route to maintain the requirements of ethical compliance. Applications for access can be made through the manuscript authors affiliated with REPROCELL-Europe Ltd.
Code availability
The MCP server code to access our FM inference task is open-source at: https://github.com/BiomedSciAI/biomed-multi-alignment/tree/main/mammal_mcp. The code for our bioinformatics processing (genomics, SNP annotation, mutation of proteins, RNA-seq analysis) is based entirely on open-source tools and packages, which are detailed in the methods.
Acknowledgements
This work was supported by the Hartree National Centre for Digital Innovation (HNCDI), a collaboration between STFC and IBM.
Funding
This work was supported by the Hartree National Centre for Digital Innovation, a collaboration between the Science and Technology Facilities Council (STFC) and IBM (L.J.G., J.K., A.E., S.C.) that is funded by the UK Research and Innovation (UKRI) funding agency. The functional pharmacology experiments and whole exome sequencing for this study were part-funded via a project grant from Precision Medicine Scotland Innovation Centre (PMS-IC), provided to REPROCELL Europe Ltd (K.B., G.M., D.B.). PMS-IC is funded by the Scottish Funding Council and Scottish Enterprise. The funding organisations provided financial support only, in the form of authors’ salaries and research materials, and did not play an additional role in the study design, data collection and analysis, or preparation of the manuscript.
Author information
Contributions
All authors contributed code, bioinformatics support, and/or study conceptualisation. L.J.G. performed primary conceptualisation, analyses and manuscript writing. J.K. performed MCP server development and interaction state determination. A.E. performed MCP server development. S.C. performed transcriptomics bioinformatics. L.J.G., J.K., A.E., G.M., D.B. and K.B. performed manuscript review and editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics approval and consent to participate
The research was conducted with the approval of the West of Scotland Research Ethics Committee (approvals 12/ws/0069, 17/WS/0049 and 22/WS/0007) and the Advarra Institutional Review Board (Protocol ID: Pro00005300). All procedures performed in this study involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Written informed consent was obtained from all individual participants involved in the study.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Gardiner, LJ., Kelly, J., Evans, A. et al. Multi-omics feature engineering driven by biomedical foundation models improves drug response prediction for inflammatory bowel disease patients. Sci Rep (2026). https://doi.org/10.1038/s41598-026-44366-y
DOI: https://doi.org/10.1038/s41598-026-44366-y