Abstract
Deep learning has revolutionized chemical research by accelerating the discovery and understanding of complex chemical systems. However, polymer chemistry lacks a unified deep learning framework owing to the complexity of polymer structures. Existing self-supervised learning methods simplify polymers into repeating units and neglect their inherent periodicity, thereby limiting the models’ ability to generalize across tasks. To address this, we propose a periodicity-aware deep learning framework for polymers, PerioGT. In pre-training, a chemical knowledge-driven periodicity prior is constructed and incorporated into the model through contrastive learning. Then, periodicity prompts are learned in fine-tuning based on the prior. Additionally, a graph augmentation strategy is employed, which integrates additional conditions via virtual nodes to model complex chemical interactions. PerioGT achieves state-of-the-art performance on 16 downstream tasks. Wet-lab experiments highlight PerioGT’s potential in the real world, identifying two polymers with potent antimicrobial properties. Our results demonstrate that introducing the periodicity prior effectively enhances model performance.
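The contrastive pre-training step can be illustrated with a minimal sketch. The abstract does not state which contrastive objective PerioGT uses, so the InfoNCE-style loss below is an assumption chosen because it is a common form, and the names `info_nce`, `anchors`, `positives` and `temperature` are hypothetical. Here `anchors[i]` could stand for the embedding of a polymer graph and `positives[i]` for the embedding of its periodicity-prior view; every other entry in the batch acts as a negative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch (illustrative, not the authors' code).

    positives[i] is treated as the matching view of anchors[i];
    positives[j] with j != i serve as in-batch negatives.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching pair as the correct class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    graph_views = [[1.0, 0.0], [0.0, 1.0]]
    prior_views = [[0.9, 0.1], [0.2, 0.8]]
    print(info_nce(graph_views, prior_views))
```

The loss is small when each embedding is closest to its own paired view and large when it is closer to another sample's view, which is the behavior a contrastive objective relies on to pull matched views together.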
Data availability
The pre-training dataset is available via GitHub at https://github.com/RUIMINMA1996/PI1M (ref. 25). The Tg, Tm and ρ datasets can be downloaded from ref. 51, and the Egc, Egb, Eat, Ei, Eea, nc and ε0 datasets are available via The Georgia Institute of Technology at https://khazana.gatech.edu/dataset/ (ref. 68). The EA and IP datasets are available via GitHub at https://github.com/coleygroup/polymer-chemprop-data (ref. 30). The OPV dataset is available via ACS at https://pubs.acs.org/doi/suppl/10.1021/acs.jpclett.8b00635/suppl_file/jz8b00635_si_002.txt (ref. 69). The PE dataset is available via UC Santa Barbara at https://pedatamine.org/ (ref. 70). The MAR1 and MAR2 datasets are available via Zenodo at https://doi.org/10.5281/zenodo.17035498 (ref. 71). These data are also available via GitHub at https://github.com/wuyuhui-zju/PerioGT. The pre-trained and fine-tuned checkpoints, together with the processed datasets, are available via Zenodo at https://doi.org/10.5281/zenodo.17035498 (ref. 71). Source data are provided with this paper.
Code availability
The code of PerioGT is available via GitHub at https://github.com/wuyuhui-zju/PerioGT and via Zenodo at https://doi.org/10.5281/zenodo.17035498 (ref. 71).
References
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).
Watson, J. L. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100 (2023).
Zhang, H. et al. Algorithm for optimized mRNA design improves stability and immunogenicity. Nature 621, 396–403 (2023).
Rinehart, N. I. et al. A machine-learning tool to predict substrate-adaptive conditions for Pd-catalyzed C–N couplings. Science 381, 965–972 (2023).
Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).
Patra, T. K. Data-driven methods for accelerating polymer design. ACS Polym. Au 2, 8–26 (2021).
Martin, T. B. & Audus, D. J. Emerging trends in machine learning: a polymer perspective. ACS Polym. Au 3, 239–258 (2023).
Struble, D. C., Lamb, B. G. & Ma, B. A prospective on machine learning challenges, progress, and potential in polymer science. MRS Commun. 14, 752–770 (2024).
Ge, W., De Silva, R., Fan, Y., Sisson, S. A. & Stenzel, M. H. Machine learning in polymer research. Adv. Mater. 37, 2413695 (2025).
Yang, J., Tao, L., He, J., McCutcheon, J. R. & Li, Y. Machine learning enables interpretable discovery of innovative polymers for gas separation membranes. Sci. Adv. 8, eabn9545 (2022).
Tao, L., Varshney, V. & Li, Y. Benchmarking machine learning models for polymer informatics: an example of glass transition temperature. J. Chem. Inf. Model. 61, 5395–5413 (2021).
Arora, A. et al. Random forest predictor for diblock copolymer phase behavior. ACS Macro Lett. 10, 1339–1345 (2021).
Tao, L. et al. Discovery of multi-functional polyimides through high-throughput screening using explainable machine learning. Chem. Eng. J. 465, 142949 (2023).
Li, H. et al. Machine learning-accelerated discovery of heat-resistant polysulfates for electrostatic energy storage. Nat. Energy 10, 90–100 (2025).
Sun, W. et al. Machine learning–assisted molecular design and efficiency prediction for high-performance organic photovoltaic materials. Sci. Adv. 5, eaay4275 (2019).
Meenakshisundaram, V., Hung, J.-H., Patra, T. K. & Simmons, D. S. Designing sequence-specific copolymer compatibilizers using a molecular-dynamics-simulation-based genetic algorithm. Macromolecules 50, 1155–1166 (2017).
Kim, C., Chandrasekaran, A., Huan, T. D., Das, D. & Ramprasad, R. Polymer genome: a data-powered polymer informatics platform for property predictions. J. Phys. Chem. C 122, 17575–17585 (2018).
Gong, D. et al. Machine learning guided structure function predictions enable in silico nanoparticle screening for polymeric gene delivery. Acta Biomater. 154, 349–358 (2022).
Patel, R. A., Borca, C. H. & Webb, M. A. Featurization strategies for polymer sequence or composition design by machine learning. Mol. Syst. Des. Eng. 7, 661–676 (2022).
Tamasi, M. J. et al. Machine learning on a robotic platform for the design of polymer-protein hybrids. Adv. Mater. 34, 12 (2022).
Zhang, X. Y. et al. Polymer-unit fingerprint (PUFp): an accessible expression of polymer organic semiconductors for machine learning. ACS Appl. Mater. Interfaces 15, 21537–21548 (2023).
Tropsha, A., Isayev, O., Varnek, A., Schneider, G. & Cherkasov, A. Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR. Nat. Rev. Drug Discov. 23, 141–155 (2024).
Webb, M. A., Jackson, N. E., Gil, P. S. & de Pablo, J. J. Targeted sequence design within the coarse-grained polymer genome. Sci. Adv. 6, eabc6216 (2020).
Tao, L., Byrnes, J., Varshney, V. & Li, Y. Machine learning strategies for the structure-property relationship of copolymers. iScience 25, 104585 (2022).
Ma, R. & Luo, T. PI1M: a benchmark database for polymer informatics. J. Chem. Inf. Model. 60, 4684–4690 (2020).
Miccio, L. A. & Schwartz, G. A. From chemical structure to quantitative polymer properties prediction through convolutional neural networks. Polymer 193, 122341 (2020).
Yan, C., Feng, X. M., Wick, C., Peters, A. & Li, G. Q. Machine learning assisted discovery of new thermoset shape memory polymers based on a small training dataset. Polymer 214, 12 (2021).
Zang, X., Zhao, X. & Tang, B. Hierarchical molecular graph self-supervised learning for property prediction. Commun. Chem. 6, 34 (2023).
Antoniuk, E. R., Li, P., Kailkhura, B. & Hiszpanski, A. M. Representing polymers as periodic graphs with learned descriptors for accurate polymer property predictions. J. Chem. Inf. Model. 62, 5435–5445 (2022).
Aldeghi, M. & Coley, C. W. A graph representation of molecular ensembles for polymer property prediction. Chem. Sci. 13, 10486–10498 (2022).
Zhang, S. et al. Deep learning-assisted design of novel donor–acceptor combinations for organic photovoltaic materials with enhanced efficiency. Adv. Mater. 37, 2407613 (2025).
Gurnani, R. et al. AI-assisted discovery of high-temperature dielectrics for energy storage. Nat. Commun. 15, 6107 (2024).
Park, J. et al. Prediction and interpretation of polymer properties using the graph convolutional network. ACS Polym. Au 2, 213–222 (2022).
Chen, D. et al. Measuring and relieving the over-smoothing problem for graph neural networks from the topological view. In Proc. AAAI Conference on Artificial Intelligence 3438–3445 (AAAI Press, 2020).
Wang, Y., Wang, J., Cao, Z. & Barati Farimani, A. Molecular contrastive learning of representations via graph neural networks. Nat. Mach. Intell. 4, 279–287 (2022).
Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).
Qiang, B. et al. Bridging the gap between chemical reaction pretraining and conditional molecule generation with a unified model. Nat. Mach. Intell. 5, 1476–1485 (2023).
Kuenneth, C. & Ramprasad, R. polyBERT: a chemical language model to enable fully machine-driven ultrafast polymer informatics. Nat. Commun. 14, 4099 (2023).
Xu, C., Wang, Y. & Barati Farimani, A. TransPolymer: a transformer-based language model for polymer property predictions. npj Comput. Mater. 9, 64 (2023).
Lin, T.-S. et al. BigSMILES: a structurally-based line notation for describing macromolecules. ACS Cent. Sci. 5, 1523–1531 (2019).
Lin, T.-S., Rebello, N. J., Lee, G. H., Morris, M. A. & Olsen, B. D. Canonicalizing BigSMILES for polymers with defined backbones. ACS Polym. Au 2, 486–500 (2022).
Schneider, L., Walsh, D., Olsen, B. & de Pablo, J. Generative BigSMILES: an extension for polymer informatics, computer simulations & ML/AI. Digit. Discov. 3, 51–61 (2024).
Luo, Y. et al. Masked graph modeling with multi-view contrast. In Proc. 40th International Conference on Data Engineering 2584–2597 (IEEE, 2024).
Tan, H., Lei, J., Wolf, T. & Bansal, M. Vimpac: video pre-training via masked token prediction and contrastive learning. Preprint at https://www.arxiv.org/abs/2106.11250 (2021).
Chaitanya, K., Erdil, E., Karani, N. & Konukoglu, E. Contrastive learning of global and local features for medical image segmentation with limited annotations. Adv. Neural Inf. Process. Syst. 33, 12546–12558 (2020).
Moriwaki, H., Tian, Y.-S., Kawashita, N. & Takagi, T. Mordred: a molecular descriptor calculator. J. Cheminform. 10, 4 (2018).
RDKit: open-source cheminformatics (RDKit, 2021); http://www.rdkit.org
Liu, P. et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput. Surv. 55, 1–35 (2021).
Fang, T., Zhang, Y., Yang, Y., Wang, C. & Chen, L. Universal prompt tuning for graph neural networks. Adv. Neural Inf. Process. Syst. 36, 52464–52489 (2023).
Fang, Y. et al. Knowledge graph-enhanced molecular contrastive learning with functional prompt. Nat. Mach. Intell. 5, 542–553 (2023).
Liu, G., Zhao, T., Xu, J., Luo, T. & Jiang, M. Graph rationalization with environment-based augmentations. In Proc. 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 1069–1078 (ACM, 2022).
Wang, T. & Isola, P. Understanding contrastive representation learning through alignment and uniformity on the hypersphere. In Proc. 37th International Conference on Machine Learning 9929–9939 (PMLR, 2020).
Bemis, G. W. & Murcko, M. A. The properties of known drugs. 1. Molecular frameworks. J. Med. Chem. 39, 2887–2893 (1996).
Mookherjee, N., Anderson, M. A., Haagsman, H. P. & Davidson, D. J. Antimicrobial host defence peptides: functions and clinical potential. Nat. Rev. Drug Discov. 19, 311–332 (2020).
Huang, J. et al. Identification of potent antimicrobial peptides via a machine-learning pipeline that mines the entire space of peptide sequences. Nat. Biomed. Eng. 7, 797–810 (2023).
Shabani, S. et al. Synthetic peptide branched polymers for antibacterial and biomedical applications. Nat. Rev. Bioeng. 2, 343–361 (2024).
Zhou, M. et al. A dual-targeting antifungal is effective against multidrug-resistant human fungal pathogens. Nat. Microbiol. 9, 1325–1339 (2024).
Phuong, P. T. et al. Effect of hydrophobic groups on antimicrobial and hemolytic activity: developing a predictive tool for ternary antimicrobial polymers. Biomacromolecules 21, 5241–5255 (2020).
Furka, Á. Forty years of combinatorial technology. Drug Discov. Today 27, 103308 (2022).
Bai, P., Liu, X. & Lu, H. Geometry-aware line graph transformer pre-training for molecular property prediction. Preprint at https://www.arxiv.org/abs/2309.00483 (2023).
Li, H., Zhao, D. & Zeng, J. KPGT: knowledge-guided pre-training of graph transformer for molecular property prediction. In Proc. 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 857–867 (ACM, 2022).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) 4171–4186 (ACL, 2019).
He, K., Fan, H., Wu, Y., Xie, S. & Girshick, R. Momentum contrast for unsupervised visual representation learning. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 9729–9738 (IEEE, 2020).
Ying, C. et al. Do Transformers really perform bad for graph representation? Adv. Neural Inf. Process. Syst. 34, 28877–28888 (2021).
Rampášek, L. et al. Recipe for a general, powerful, scalable graph transformer. Adv. Neural Inf. Process. Syst. 35, 14501–14515 (2022).
Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5998–6008 (2017).
Corso, G., Cavalleri, L., Beaini, D., Liò, P. & Veličković, P. Principal neighbourhood aggregation for graph nets. Adv. Neural Inf. Process. Syst. 33, 13260–13271 (2020).
Kuenneth, C. et al. Polymer informatics with multi-task learning. Patterns 2, 100238 (2021).
Nagasawa, S., Al-Naamani, E. & Saeki, A. Computer-aided screening of conjugated polymers for organic solar cell: classification by random forest. J. Phys. Chem. Lett. 9, 2639–2646 (2018).
Schauser, N. S., Kliegle, G. A., Cooke, P., Segalman, R. A. & Seshadri, R. Database creation, visualization, and statistical learning for polymer Li+-electrolyte design. Chem. Mater. 33, 4863–4876 (2021).
Wu, Y. Datasets and checkpoints for PerioGT. Zenodo https://doi.org/10.5281/zenodo.17035498 (2025).
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. In Proc. 34th International Conference on Machine Learning 1263–1272 (PMLR, 2017).
Veličković, P. et al. Graph attention networks. In Proc. 6th International Conference on Learning Representations (ICLR, 2018).
Acknowledgements
This work was supported by the National Natural Science Foundation of China (grant no. 52293381), the Zhejiang Provincial Natural Science Foundation of China (grant no. LR25E030001) and the National Key Research and Development Program of China (grant no. 2022YFB3807300). This work was also supported by the Transvascular Implantation Devices Research Institute China under grant nos. KY012024007 and KY012024009.
Author information
Contributions
Y.W. conceived the main idea and conducted the in silico experiments. C.W. was responsible for chemical synthesis and biological characterization. Y.W. and C.W. wrote the paper together. X.S. and T.Z. participated in discussions and provided many suggestions for the wet-lab experiments. P.Z. and J.J. guided the whole project. All authors reviewed and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Computational Science thanks Jacob Gissinger and Boran Ma for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Application of PerioGT in antimicrobial polymer discovery.
a, Pairwise combinations of diacrylates (pink) and amines (blue) generated polymers via Michael addition. A subset of 150 polymers was synthesized and labeled for antimicrobial activity. PerioGT was fine-tuned to predict the activity of the unlabeled products, and the top 30 candidates with the highest predicted activity were selected for validation. b, Distribution of the minimum inhibitory concentration (MIC) in the training set and in the top 30 candidates selected by each model (n = 3 biologically independent replicates). The contour shows the kernel density estimation, with a white dot for the median, a thick bar for the interquartile range, and thin lines for the 95% confidence intervals. Lower MIC indicates better antimicrobial activity. The part above the dashed line indicates no antimicrobial activity (MIC > 64 μg ml−1). c, The two polymers with the lowest MIC (8 μg ml−1) predicted by PerioGT and evaluated by wet-lab experiments. d, Live/dead staining assay. The assay was performed using N01 to label live cells (green) and propidium iodide (PI) to label dead cells (red). Scale bar: 20 μm. e, MRSA colony counting after incubation with polymers 1 and 2 for 9 h. Data are presented as mean values ± s.d., n = 3 biologically independent replicates. Lower values indicate fewer surviving bacterial colonies. f, Membrane potential disturbance induced by different treatments. Phosphate-buffered saline (PBS) and Triton X-100 (TX) were included as negative and positive controls, respectively. g, TEM characterization of MRSA after incubation with polymers 1 and 2 for 5 min. Scale bar: 200 nm. Panel a created with BioRender.com.
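The screening workflow in panel a can be sketched as follows. This is an illustrative outline only: the function `screen_library`, the building-block identifiers and the `predict_activity` callable are hypothetical stand-ins, and the callable takes the place of the fine-tuned model, whose actual interface is not described here.

```python
from itertools import product


def screen_library(diacrylates, amines, labelled, predict_activity, top_k=30):
    """Rank unlabeled diacrylate-amine pairs by predicted activity.

    diacrylates, amines: iterables of building-block identifiers.
    labelled: set of (diacrylate, amine) pairs already synthesized and measured.
    predict_activity: callable mapping a pair to a score, higher meaning more
        active (a placeholder for a fine-tuned predictor).
    """
    # Enumerate every pairwise combination, skipping pairs that already have labels.
    candidates = [
        pair for pair in product(diacrylates, amines) if pair not in labelled
    ]
    # Sort by predicted activity, best first, and keep the top_k candidates.
    ranked = sorted(candidates, key=predict_activity, reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Toy identifiers and a toy scoring rule, used only to exercise the function.
    toy_scores = {("D1", "A2"): 0.2, ("D2", "A1"): 0.7, ("D2", "A2"): 0.9}
    shortlist = screen_library(
        ["D1", "D2"],
        ["A1", "A2"],
        labelled={("D1", "A1")},
        predict_activity=lambda pair: toy_scores[pair],
        top_k=2,
    )
    print(shortlist)
```

In the study itself, the 150 labeled polymers would play the role of `labelled` and `top_k` would be 30; the selected candidates are then validated experimentally rather than trusted on predicted scores alone.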
Supplementary information
Supplementary Information
Supplementary Notes 1–3, Figs. 1–6 and Tables 1–11.
Supplementary Data 1
The polymer libraries and building blocks constructed in the case study.
Source data
Source Data Fig. 2
Source data for Fig. 2.
Source Data Fig. 3
Source data for Fig. 3.
Source Data Fig. 4
Source data for Fig. 4.
Source Data Extended Data Fig. 1
Source data for Extended Data Fig. 1.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, Y., Wang, C., Shen, X. et al. Periodicity-aware deep learning for polymers. Nat Comput Sci 5, 1214–1226 (2025). https://doi.org/10.1038/s43588-025-00903-9
DOI: https://doi.org/10.1038/s43588-025-00903-9


