Main

PTMs, such as phosphorylation, acetylation, ubiquitination and glycosylation, vastly expand the functional diversity of eukaryotic proteomes, influencing essential processes like enzyme activity, protein turnover, signaling cascades and DNA repair1,2. Dysregulation of PTMs often leads to severe diseases, including cancer, neurodegeneration and aging2,3. For example, phosphorylation of STAT3 at specific residues transforms it from a typical transcription factor into a driver of tumorigenesis and metastasis in various cancers4,5. Understanding and modeling the unique sequence features of post-translationally modified proteins is therefore crucial for advancing proteome-wide insights and therapeutic design. Protein LMs have emerged as transformative tools for encoding physicochemical and functional information in protein sequences6. Models like ESM-2 and ProtT5 excel at sequence representation, whereas autoregressive protein LMs like ProGen and ProtGPT2 generate functional proteins7,8,9,10. In a therapeutic context, our generative language models, such as SaLT&PepPr, PepPrCLIP, PepMLM and moPPIt, have enabled the design of peptides that bind and degrade specific targets, including disordered proteins11,12,13,14. However, existing protein LMs entirely exclude PTM residues from their training and inference pipelines7,8,9,10, limiting their ability to model PTM-specific effects.

We hypothesized that combining ESM-2 embeddings with a specialized framework for handling PTM tokens would enable accurate modeling of both wild-type residues and PTMs. To test this, we curated a training dataset of 79,707 modified sequences, constructed from 311,350 experimentally validated PTM records in Swiss-Prot15. We specifically mapped PTM annotations to their respective protein sequences, ensuring a diverse representation of PTM types (Supplementary Fig. 1) and sequence lengths (Supplementary Fig. 2).

We based our PTM protein LM on Mamba, a structured state-space model that offers computational efficiency and flexibility through a selective state-space architecture, which provides subquadratic time and memory complexity with sequence length16. Additionally, Mamba uses hardware-aware primitives, such as parallelized state transitions and convolutional projections, to accelerate computations without affecting scaling16. Although Mamba’s original design for autoregressive text generation limited its ability to capture full sequence semantics, we adapted it for bidirectional modeling by introducing forward and backward processing layers. The resulting bidirectional Mamba block (Fig. 1a and code snippet below) processes the sequence in two directions: a forward pass (left to right) and a backward pass (right to left). Each pass independently generates hidden states through its respective state-space layer, and the outputs are concatenated before being fused by a fully connected layer to generate a combined representation. Residual connections are applied to both the forward and backward layers, and their contributions are averaged to retain both directional contexts, ensuring comprehensive modeling of sequence dependencies for amino acids and PTMs.

Fig. 1: Architecture and embedding visualization of PTM-Mamba.

a, Primitives of PTM-Mamba. Bottom left, given a sequence, with 80% probability, we perform standard 15% token masking, and, with 20% probability, we mask all the PTM tokens and randomly mask 15% of wild-type tokens. The bidirectional Mamba block in PTM-Mamba is built on top of the Mamba block (MB), which processes the sequences in both the forward (forward Mamba block) and backward (backward Mamba block) orientation. The gated embedding fusion module inputs ESM-2 and PTM embeddings and fuses them in a gated manner via a sigmoid-activated linear layer. SSM, state-space model. b, t-SNE visualization of PTM-Mamba embeddings of select wild-type and corresponding PTM protein sequences. Orange lines connect the corresponding embeddings. c, t-SNE visualization of labeled token embeddings. Conv, local 1D convolutional layer.

def bidirectional_mamba(self, hidden_states):
    residual = None
    # Each layer triple holds a forward Mamba block, a backward Mamba block and
    # a fully connected fusion layer.
    for f_layer, b_layer, h_fc in zip(
        self.forward_layers, self.backward_layers, self.hidden_fc
    ):
        # Forward pass (left to right)
        hidden_states_f, residual_f = f_layer(
            hidden_states, residual,
        )
        # Backward pass (right to left) on the flipped sequence
        flip_residual = residual.flip([1]) if residual is not None else None
        hidden_states_b, residual_b = b_layer(
            hidden_states.flip([1]), flip_residual,
        )
        # Concatenate both directions and fuse with the fully connected layer
        hidden_states = h_fc(
            torch.cat([hidden_states_f, hidden_states_b.flip([1])], dim=-1)
        )
        # Average the forward and backward residual streams
        residual = 0.5 * (residual_f + residual_b.flip([1]))

To preserve comprehension of regular amino acids, we trained our new PTM-Mamba model as a head to the state-of-the-art ESM-2-650M model7, in which wild-type amino acid tokens are passed into ESM-2-650M to retrieve its output embeddings and PTM tokens are converted into <mask> tokens for ESM-2-650M input (Fig. 1a). Sequences are finally fed into the embedding layer of PTM-Mamba, which naturally processes both wild-type and PTM tokens. To join the ESM-2-650M and PTM-Mamba embeddings, we propose a new gating mechanism in which the two embeddings are concatenated and filtered via a sigmoid-activated linear gate to produce a final output representation (Fig. 1a and code snippet below).

def gated_fuse(input_ids, esm_embedding):
    # Embed wild-type and PTM tokens with PTM-Mamba's PTM-aware embedding layer
    ptm_mamba_embedding = Embedding(input_ids)
    # Sigmoid gate computed from the concatenated PTM-Mamba and ESM-2 embeddings
    gate = Linear(torch.cat([ptm_mamba_embedding, esm_embedding], dim=-1)).sigmoid()
    # Gated combination of the two embeddings
    hidden_states = ptm_mamba_embedding * gate + esm_embedding * (1 - gate)
    return hidden_states
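For illustration, a minimal self-contained PyTorch sketch of this gated fusion is shown below. The module and parameter names (GatedFusion, d_model, esm_dim) are our own, the projection of ESM-2-650M's 1,280-dimensional embeddings down to the model dimension is an assumption, and the snippet is not the exact training implementation.

import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Sketch of the gated fusion of PTM-Mamba token embeddings with ESM-2 embeddings."""

    def __init__(self, vocab_size: int, d_model: int = 768, esm_dim: int = 1280):
        super().__init__()
        self.token_embedding = nn.Embedding(vocab_size, d_model)  # PTM-aware vocabulary
        self.esm_proj = nn.Linear(esm_dim, d_model)                # assumed projection of ESM-2 embeddings
        self.gate = nn.Linear(2 * d_model, d_model)                # sigmoid-activated linear gate

    def forward(self, input_ids: torch.Tensor, esm_embedding: torch.Tensor) -> torch.Tensor:
        ptm_emb = self.token_embedding(input_ids)                  # (batch, length, d_model)
        esm_emb = self.esm_proj(esm_embedding)                     # (batch, length, d_model)
        g = torch.sigmoid(self.gate(torch.cat([ptm_emb, esm_emb], dim=-1)))
        return g * ptm_emb + (1 - g) * esm_emb                     # gated combination of the two embeddings

# Example with random inputs
fusion = GatedFusion(vocab_size=64)
out = fusion(torch.randint(0, 64, (2, 100)), torch.randn(2, 100, 1280))  # (2, 100, 768)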

We compared PTM-Mamba to a baseline PTM-Transformer model and observed faster convergence on training accuracy (Supplementary Fig. 3), highlighting the comparative efficiency of the bidirectional Mamba blocks and gating mechanism. Beyond efficiency, the primary objective of PTM-Mamba is to distinctly, yet relevantly, represent both unmodified and post-translationally modified sequences, capturing the critical biological functions and structural changes induced by PTMs. To assess this capability, we visualized PTM-Mamba embeddings using t-distributed stochastic neighbor embedding (t-SNE). The embeddings revealed a nuanced distinction between wild-type protein sequences and their PTM modified counterparts, with the embeddings of each wild-type–PTM pair in close proximity (Fig. 1b). This suggests the ability of PTM-Mamba to capture the subtle yet notable effects of PTMs while maintaining the contextual integrity of the protein sequence. Additionally, token embeddings for PTM residues showed class-specific organization, with spatial proximity observed among, for example, phosphorylation and acetylation tokens (Fig. 1c). PTM residue tokens also exhibited greater spatial diversity than wild-type tokens, reflecting the model's focus on encoding PTM-specific information (Fig. 1c).
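The visualization itself can be reproduced along the following lines; this is an illustrative sketch only, in which mean-pooling residue embeddings into per-sequence vectors and the array names are our assumptions.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Placeholder arrays of per-sequence embeddings (for example, mean-pooled over residues)
wt_embeddings = np.random.randn(50, 768)    # wild-type sequences
ptm_embeddings = np.random.randn(50, 768)   # matched PTM counterparts

coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(
    np.vstack([wt_embeddings, ptm_embeddings])
)
wt_xy, ptm_xy = coords[:50], coords[50:]

plt.scatter(wt_xy[:, 0], wt_xy[:, 1], label="wild type")
plt.scatter(ptm_xy[:, 0], ptm_xy[:, 1], label="PTM")
for (x0, y0), (x1, y1) in zip(wt_xy, ptm_xy):
    plt.plot([x0, x1], [y0, y1], color="orange", linewidth=0.5)  # connect corresponding pairs
plt.legend()
plt.show()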

To confirm that PTM-Mamba embeddings maintain strong performance on standard PTM prediction tasks, we evaluated them on phosphorylation site prediction (Supplementary Fig. 4) and non-histone acetylation site prediction (Supplementary Fig. 5). Using curated datasets for both tasks, we conducted per-residue binary classification and compared PTM-Mamba embeddings against baselines, including ESM-2-650M, ESM-2-3B, PTM-Transformer and baseline one-hot embeddings. PTM-Mamba maintained comparable performance across all metrics, confirming that its embeddings retain general applicability for PTM-related tasks. Notably, these tasks do not explicitly represent PTM tokens, which aligns with the observation that PTM-Mamba is primarily optimized for use cases involving modified sequences, rather than wild-type-only benchmarks.
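As a concrete example of this evaluation setup, a per-residue site-prediction head over frozen embeddings can be as simple as the sketch below; the actual tuned head architectures are reported in the Supplementary Codes, and the names and shapes here are illustrative.

import torch
import torch.nn as nn

class ResidueSiteHead(nn.Module):
    """Minimal per-residue binary classifier on top of frozen embeddings (illustrative)."""

    def __init__(self, d_model: int = 768):
        super().__init__()
        self.classifier = nn.Linear(d_model, 1)  # one logit per residue

    def forward(self, residue_embeddings: torch.Tensor) -> torch.Tensor:
        # residue_embeddings: (batch, length, d_model) from PTM-Mamba or a baseline model
        return self.classifier(residue_embeddings).squeeze(-1)  # (batch, length)

head = ResidueSiteHead()
embeddings = torch.randn(4, 512, 768)           # placeholder embeddings
labels = torch.randint(0, 2, (4, 512)).float()  # 1 = modified site, 0 = unmodified
loss = nn.functional.binary_cross_entropy_with_logits(head(embeddings), labels)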

We next evaluated PTM-Mamba on three benchmarking tasks explicitly leveraging PTM tokenization: disease association prediction, druggability prediction and the effects of PTMs on protein–protein interactions (PPIs). For disease association prediction, we used a dataset curated from the dbPTM database17 that links PTMs to conditions such as cancer, neurodegenerative disorders and diabetes, with annotations sourced from databases such as PhosphoSitePlus, ActiveDriverDB and genome-wide association studies (GWAS) as well as manual curation18,19. Druggability prediction assessed PTM sequences that influence therapeutic targetability, focusing on how modifications alter protein structure and accessibility of binding sites17. To evaluate the effects of PTMs on PPIs, we used the PTMint dataset, which annotates experimentally validated PTM-mediated regulatory roles, specifically whether a PTM induces or inhibits a PPI20. For all tasks, wild-type sequences were mapped to PTM-Mamba’s dataset, with residues replaced by the corresponding PTMs for tokenization, while baseline models, including one-hot embeddings and ESM-2 embeddings, used wild-type sequences as input.

For disease association prediction, PTM-Mamba performed strongly against baseline models, including ESM-2-650M and PTM-Transformer, demonstrating its ability to capture PTM-specific effects essential for identifying disease-associated proteins (Fig. 2a). Similarly, for druggability prediction, PTM-Mamba achieved robust performance, often exceeding baselines across key metrics such as F1 score and Matthews correlation coefficient (MCC), highlighting its relevance for therapeutic design (Fig. 2b). For the key task of predicting PTM effects on PPIs, PTM-Mamba achieved the highest metrics among all models, including PTM-Transformer and PTM-SaProt, a new baseline model that replaces ESM-2 with state-of-the-art, structure-aware SaProt protein LM embeddings21, indicating that sequence-focused models may capture PTM effects more effectively (Fig. 2c). This benchmark showcases PTM-Mamba's ability to model complex regulatory dynamics mediated by PTMs, further highlighting its utility for biologically relevant downstream applications.

Fig. 2: Performance evaluation of PTM-Mamba across diverse PTM-related tasks.

a, Disease association prediction for PTM modified sequences, evaluated across accuracy, precision, recall, F1, MCC, area under the precision–recall curve (AUPRC) and area under the receiver operating characteristic curve (AUROC). b, Druggability prediction of PTM modified sequences, evaluated across the same metrics. c, Prediction of PTM effects on PPIs, using PTMint data to classify whether a PTM induces or inhibits an interaction. All benchmarks in a–c were performed with replicates (n = 5). d, Visualization of the predicted logits for zero-shot PTM discovery. Rows denote different amino acids in the format of ‘Uniref-accession-id (amino-acid position)’, and columns denote the logit value of the PTMs. Schematic in panel c created using BioRender.com.

Finally, we explored PTM-Mamba’s utility for zero-shot PTM discovery, a task of great biological relevance. By analyzing model logits for masked positions in wild-type sequences, PTM-Mamba accurately predicted plausible PTMs for specific residues, such as <phosphoserine> for serine in UniProt sequence Q02261 and <S-diacylglycerol cysteine> for cysteine in UniProt sequence Q4L7X2 (Fig. 2d). This capability offers PTM-Mamba as a tool for biologists to generate new insights into PTM biology without requiring additional training or labels.
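Conceptually, this zero-shot procedure amounts to masking a residue of interest and ranking the PTM-token logits at that position, as in the sketch below; the model, tokenizer and helper calls are assumptions standing in for the released PTM-Mamba interface rather than its actual API.

import torch

def rank_ptms_at_position(model, tokenizer, sequence, position, ptm_tokens):
    """Return PTM tokens ranked by their logit at a masked position (illustrative)."""
    tokens = list(sequence)
    tokens[position] = "<mask>"                               # mask the residue of interest
    input_ids = tokenizer(tokens)                             # assumed to return a (1, length) LongTensor
    with torch.no_grad():
        logits = model(input_ids)                             # assumed shape (1, length, vocab_size)
    ptm_ids = [tokenizer.token_to_id(t) for t in ptm_tokens]  # assumed helper
    scores = logits[0, position, ptm_ids]
    order = torch.argsort(scores, descending=True)
    return [(ptm_tokens[i], scores[i].item()) for i in order]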

In total, PTM-Mamba provides new opportunities for modeling and designing PTM-specific protein sequences, particularly via its ability to explicitly tokenize PTM modified proteoforms for applications ranging from disease mechanism studies to therapeutic design with enhanced targeting specificity. For future work, we plan to address the limited availability of experimentally validated PTM annotations by augmenting the training dataset using mass spectrometry-based PTM databases22. We also aim to explore structure prediction of PTM modified sequences as a new task that can leverage PTM-Mamba’s embeddings, alongside extending these embeddings to design PTM-specific binders that selectively target modified protein states6,23,24. Together, by enabling PTM-aware modeling, PTM-Mamba has the potential to reshape proteome analysis and drive innovation in precision therapeutics.

Methods

Data curation

Model training data were curated from UniProt15. Specifically, 311,350 experimentally validated PTM records were collected from Swiss-Prot, and the PTM annotations of their proteins were mapped to their respective sequences to construct the new PTM sequences. The final dataset includes a total of 79,707 PTM sequences. Data curation code can be found at https://github.com/programmablebio/ptm-mamba/tree/main/ptm_data_preprocessing.
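As a simplified sketch (not the exact preprocessing code linked above), a PTM-modified sequence can be constructed by replacing each annotated residue with its PTM token, using 1-based UniProt positions; the record format here is an illustrative assumption.

def build_ptm_sequence(sequence, ptm_records):
    """ptm_records: iterable of (position, ptm_token) pairs, e.g. (4, '<Phosphoserine>')."""
    tokens = list(sequence)
    for position, ptm_token in ptm_records:
        tokens[position - 1] = ptm_token  # replace the wild-type residue with its PTM token
    return tokens

build_ptm_sequence("MKTSS", [(4, "<Phosphoserine>")])
# ['M', 'K', 'T', '<Phosphoserine>', 'S']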

Datasets for the four benchmarking tasks were collected from the following sources. Phosphorylation site data were obtained from the corresponding ProteinBERT benchmark25, originally derived from PhosphoSitePlus18 and filtered for sequences between 256 and 512 amino acids in length, yielding a training set of 15,588 sequences, a validation set of 1,707 sequences and a testing set of 3,106 sequences. Non-histone acetylation site prediction was performed in the same manner as described in prior literature, using the non-histone acetylation collection dataset26. Druggability and disease association datasets were curated from the dbPTM database17. PPI data describing the effect of PTMs were curated from PTMint, which encompasses 2,477 nonredundant PTM sites in 1,169 proteins affecting 2,371 protein–protein pairs20. In brief, wild-type sequences were mapped to corresponding entries in the PTM-Mamba dataset, and wild-type residues were replaced by the corresponding position-specific PTMs for tokenization by specified models. For all other baseline models trained with standard one-hot embeddings or ESM-2 embeddings, the corresponding wild-type sequence was used as input.

Tokenization

In our tokenization scheme, we use the standard set of amino acid tokens as described in ESM-2 (ref. 7). In addition to special tokens, the 20 wild-type amino acid tokens are as follows: D, N, E, K, V, Y, A, Q, M, I, T, L, R, F, G, C, S, P, H, W. We introduce new PTM tokens, each corresponding to its specific UniProt annotation: <N-linked (GlcNAc…) asparagine>, <Pyrrolidone carboxylic acid>, <Phosphoserine>, <Phosphothreonine>, <N-acetylalanine>, <N-acetylmethionine>, <N6-acetyllysine>, <Phosphotyrosine>, <S-diacylglycerol cysteine>, <N6-(pyridoxal phosphate)lysine>, <N-acetylserine>, <N6-carboxylysine>, <N6-succinyllysine>, <S-palmitoyl cysteine>, <O-(pantetheine 4-phosphoryl)serine>, <Sulfotyrosine>, <O-linked (GalNAc…) threonine>, <Omega-N-methylarginine>, <N-myristoyl glycine>, <4-hydroxyproline>, <Asymmetric dimethylarginine>, <N5-methylglutamine>, <4-aspartylphosphate>, <S-geranylgeranyl cysteine>, <4-carboxyglutamate>. The top two most abundant PTM tokens are <N-linked (GlcNAc…) asparagine> and <Phosphoserine>. The full distribution of the PTM tokens is shown in Supplementary Fig. 1, and the full set of PTM tokens is presented in Supplementary Table 1. The wild-type amino acid tokens are then converted into embeddings by both ESM-2-650M and PTM-Mamba, while the PTM tokens are only processed by PTM-Mamba.
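The dual input path can be illustrated as follows; the helper function is hypothetical, but it mirrors the scheme above in which PTM tokens are retained for PTM-Mamba and replaced by <mask> before the sequence is passed to ESM-2-650M.

WILD_TYPE_TOKENS = set("DNEKVYAQMITLRFGCSPHW")

def split_inputs(tokens):
    """Return (PTM-Mamba input, ESM-2 input) for a PTM-aware token sequence (illustrative)."""
    ptm_mamba_input = list(tokens)                                          # full PTM-aware sequence
    esm_input = [t if t in WILD_TYPE_TOKENS else "<mask>" for t in tokens]  # PTM tokens masked for ESM-2
    return ptm_mamba_input, esm_input

split_inputs(["M", "K", "<Phosphoserine>", "T"])
# (['M', 'K', '<Phosphoserine>', 'T'], ['M', 'K', '<mask>', 'T'])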

PTM-Mamba training procedure

PTM-Mamba was trained on an Nvidia 8xA100 DGX system with 640 GB of shared VRAM on an adjusted masked language modeling task in which, rather than random 15% token masking, we bias masking toward PTM residue tokens (Fig. 1a). Briefly, given a sequence, with 80% probability, we perform standard 15% token masking, and, with 20% probability, we mask all the PTM tokens and randomly mask 15% of wild-type tokens. For training, we then consider a protein sequence with masked residues, where the model aims to predict the original tokens at these residue positions. Let xi denote the original residue token at position i that has been masked, and let yi denote the residue token predicted by the model for this position. The loss function L for masked language modeling can be defined as the negative log likelihood of the correct tokens given their masked inputs, summed over all N masked positions:

$$L=-\mathop{\sum }\limits_{i=1}^{N}\log P\left({x}_{i}|{x}_{{\rm{masked}}}\right).$$

\(P({x}_{i}|{x}_{{\rm{masked}}})\) represents the probability of predicting the correct original token xi at the masked position i, given the masked input sequence xmasked. PTM-Mamba was trained via the Adam optimizer with no weight decay. The final PTM-Mamba model has 24 layers with a hidden dimension of 768. It was trained for 16,765 steps (425 epochs) at a constant learning rate of 0.0002 with a batch size of 256 and dynamic batching. Training sequences were randomly cropped to a maximal length of 1,024 or padded at the end to reach a length of 1,024. During training, we clustered sequences by length to construct batches, which were fed into the model from the shortest to the longest sequences.
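The biased masking scheme and the loss above can be sketched as follows; this is an illustrative rendering under our own assumptions (helper names, mean-reduced cross-entropy) rather than the reference training code in the repository.

import random
import torch
import torch.nn.functional as F

def apply_biased_masking(tokens, ptm_token_set, mask_token="<mask>"):
    """With 80% probability mask 15% of all tokens; otherwise mask every PTM token plus 15% of wild-type tokens."""
    masked = list(tokens)
    if random.random() < 0.8:
        # Standard masking: each position masked with 15% probability
        for i in range(len(masked)):
            if random.random() < 0.15:
                masked[i] = mask_token
    else:
        # PTM-biased masking: mask every PTM token plus 15% of wild-type tokens
        for i, t in enumerate(masked):
            if t in ptm_token_set or random.random() < 0.15:
                masked[i] = mask_token
    return masked

def masked_lm_loss(logits, labels, mask):
    # logits: (batch, length, vocab_size); labels: (batch, length); mask: (batch, length) bool
    # Cross-entropy at masked positions only, i.e., -log P(x_i | x_masked) averaged over masked tokens
    return F.cross_entropy(logits[mask], labels[mask])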

Benchmark model training

For all the benchmark tasks, we leverage the embeddings from pretrained PTM-Mamba and ESM-2 models and fine-tune a classification head on top of the embeddings. We extensively tuned the classification head architectures as well as the training hyperparameters for the benchmarks and have reported the optimal settings in Supplementary Codes 1–4 and Supplementary Table 2. For models trained on one-hot embeddings of wild-type input sequences, an nn.Embedding layer followed by a linear layer was used. All benchmark models were trained on an Nvidia 8xA100 GPU DGX system with 640 GB of shared VRAM. For robust performance comparison, we replicate each model (n = 5) and report the individual and average results. Models were evaluated using accuracy, precision, recall, F1 score, MCC, AUROC and AUPRC metrics via scikit-learn27.
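A representative sequence-level setup is sketched below, with a simple pooled classification head standing in for the tuned architectures reported in the Supplementary Codes and metrics computed via scikit-learn; the pooling choice and layer sizes are assumptions.

import torch
import torch.nn as nn
from sklearn.metrics import (
    accuracy_score, f1_score, matthews_corrcoef, roc_auc_score, average_precision_score
)

class SequenceClassifier(nn.Module):
    """Illustrative sequence-level head over frozen embeddings."""

    def __init__(self, d_model: int = 768):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(d_model, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, residue_embeddings: torch.Tensor) -> torch.Tensor:
        pooled = residue_embeddings.mean(dim=1)   # mean-pool over residues (assumption)
        return self.head(pooled).squeeze(-1)      # one logit per sequence

def evaluate(y_true, y_prob, threshold=0.5):
    """Compute the reported classification metrics from true labels and predicted probabilities."""
    y_pred = (y_prob >= threshold).astype(int)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "mcc": matthews_corrcoef(y_true, y_pred),
        "auroc": roc_auc_score(y_true, y_prob),
        "auprc": average_precision_score(y_true, y_prob),
    }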

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.