Empowering artificial intelligence with homomorphic encryption for secure deep reinforcement learning

Nguyen, Chi-Hieu; Dinh, Thai Hoang; Nguyen, Diep N.; Lauter, Kristin; Kim, Miran

doi:10.1038/s42256-025-01135-2

Article
Published: 01 December 2025

Empowering artificial intelligence with homomorphic encryption for secure deep reinforcement learning

Nature Machine Intelligence volume 7, pages 1913–1926 (2025)Cite this article

1728 Accesses
1 Citations
19 Altmetric
Metrics details

Subjects

Abstract

Deep reinforcement learning (DRL) demonstrates significant potential in solving complex control and decision-making problems, but it may inadvertently expose sensitive, environment-specific information, raising privacy and security concerns for computer systems, humans and organizations. This work introduces a privacy-preserving framework using homomorphic encryption and advanced learning algorithms to secure DRL processes. Our framework enables the encryption of sensitive information, including states, actions and rewards, before sharing it with an untrusted processing platform. This encryption ensures data privacy, prevents unauthorized access and maintains compliance with data protection laws throughout the learning process. In addition, we develop innovative algorithms to efficiently handle a wide range of encrypted control tasks. Our core innovation is the homomorphic encryption-compatible Adam optimizer, which reparameterizes momentum values to bypass the need for high-degree polynomial approximations of inverse square roots on encrypted data. This adaptation, previously unexplored in homomorphic encryption-based ML research, enables stable and efficient training with adaptive learning rates in encrypted domains, addressing a critical bottleneck for privacy-preserving DRL with sparse rewards. Evaluations on standard DRL benchmarks demonstrate that our encrypted DRL performs comparably with its unencrypted counterpart (with a gap of less than 10%) and maintaining data confidentiality with homomorphic encryption. This work facilitates the integration of privacy-preserving DRL into real-world applications, addressing critical privacy concerns, and promoting the ethical advancement of artificial intelligence.

Access through your institution

Buy or subscribe

This is a preview of subscription content, access via your institution

Access options

Access through your institution

Buy this article

Purchase on SpringerLink
Instant access to the full article PDF.

USD 39.95

Prices may be subject to local taxes which are calculated during checkout

**Fig. 1: Workflow of our proposed privacy-preserving DRL framework.**

**Fig. 2: Overview of the proposed HE-based SAC framework with polynomial approximations and computation map optimization.**

**Fig. 3: Learning performance across RL environments.**

**Fig. 4: Illustration of different interactive HE-SAC variants.**

Design of an improved model using federated learning and LSTM autoencoders for secure and transparent blockchain network transactions

Article Open access 10 January 2025

Towards compute-efficient Byzantine-robust federated learning with fully homomorphic encryption

Article 08 September 2025

Exploring spiking neural networks for deep reinforcement learning in robotic tasks

Article Open access 28 December 2024

Data availability

All data used in this study are publicly available. The Pendulum and CartPole environments are available via GitHub (https://github.com/openai/gym) and the MobileEnv environment is also available via GitHub (https://github.com/stefanbschneider/mobile-env).

Code availability

The software code supporting the findings of the secure SAC algorithm in this study is available via GitHub (https://github.com/hieunch/PPRL) and via Zenodo (https://doi.org/10.5281/zenodo.17038255)⁶⁴.

References

Huang, Z., Wu, J. & Lv, C. Efficient deep reinforcement learning with imitative expert priors for autonomous driving. IEEE Trans. Neural Netw. Learn. Syst. 33, 3675–3684 (2022).
Feng, S. et al. Dense reinforcement learning for safety validation of autonomous vehicles. Nature 615, 620–627 (2023).
Article Google Scholar
Hachem, E. et al. Reinforcement learning for patient-specific optimal stenting of intracranial aneurysms. Sci. Rep. 13, 7147 (2023).
Article Google Scholar
Hu, M., Zhang, J., Matkovic, L., Liu, T. & Yang, X. Reinforcement learning in medical image analysis: concepts, applications, challenges, and future directions. J. Appl. Clin. Med. Phys. 24, 13898 (2023).
Article Google Scholar
Lei, Y. et al. New challenges in reinforcement learning: a survey of security and privacy. Artif. Intell. Rev. 56, 7195–7236 (2023).
Article Google Scholar
Mo, K. et al. Security and privacy issues in deep reinforcement learning: threats and countermeasures. ACM Comput. Surv. 56, 152 (2024).
Article Google Scholar
Pan, X. et al. How you act tells a lot: privacy-leaking attack on deep reinforcement learning. In Proc. 18th International Conference on Autonomous Agents and MultiAgent Systems 368–376 (ACM, 2019).
Vietri, G., Balle, B., Krishnamurthy, A. & Wu, S. Private reinforcement learning with PAC and regret guarantees. In Proc. International Conference on Machine Learning 9754–9764 (PMLR, 2020).
Garcelon, E., Perchet, V., Pike-Burke, C. & Pirotta, M. Local differential privacy for regret minimization in reinforcement learning. Adv. Neural Inf. Process. Syst. 34, 10561–10573 (2021).
Google Scholar
Chowdhury, S. R. & Zhou, X. Differentially private regret minimization in episodic Markov decision processes. In Proc. AAAI Conference on Artificial Intelligence 36, 6375–6383 (AAAI, 2022).
Qiao, D. & Wang, Y.-X. Near-optimal differentially private reinforcement learning. In Proc. International Conference on Artificial Intelligence and Statistics 9914–9940 (PMLR, 2023).
Jesu, A., Darvariu, V.-A., Staffolani, A., Montanari, R. & Musolesi, M. Reinforcement learning on encrypted data. Preprint at https://doi.org/10.48550/arXiv.2109.08236 (2021).
Knott, B. et al. Crypten: secure multi-party computation meets machine learning. Adv. Neural Inf. Process. Syst. 34, 4961–4973 (2021).
Google Scholar
Tan, S., Knott, B., Tian, Y. & Wu, D. J. CryptGPU: fast privacy-preserving machine learning on the GPU. In Proc. IEEE Symposium on Security and Privacy (SP) 1021–1038 (IEEE, 2021).
Rathee, D., Bhattacharya, A., Gupta, D., Sharma, R. & Song, D. Secure floating-point training. In Proc. 32nd USENIX Security Symposium 6329–6346 (USENIX Association, 2023).
Wang, S. et al. Privacy-aware estimation of relatedness in admixed populations. Brief. Bioinform. 23, 473 (2022).
Article Google Scholar
Li, W. et al. COLLAGENE enables privacy-aware federated and collaborative genomic data analysis. Genome Biol. 24, 204 (2023).
Article Google Scholar
Nandakumar, K., Ratha, N., Pankanti, S. & Halevi, S. Towards deep neural network training on encrypted data. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops 40–48 (IEEE, 2019).
Al Badawi, A., Hoang, L., Mun, C. F., Laine, K. & Aung, K. M. M. Privft: private and fast text classification with homomorphic encryption. IEEE Access 8, 226544–226556 (2020).
Article Google Scholar
Kim, M., Jiang, X., Lauter, K., Ismayilzada, E. & Shams, S. Secure human action recognition by encrypted neural network inference. Nat. Commun. 13, 4799 (2022).
Article Google Scholar
Nguyen, C. et al. Encrypted data caching and learning framework for robust federated learning-based mobile edge computing. IEEE/ACM Trans. Netw. 32, 2705–2720 (2024).
Article Google Scholar
Cheon, J. H., Kim, A., Kim, M. & Song, Y. Homomorphic encryption for arithmetic of approximate numbers. In Proc. 23rd International Conference on the Theory and Applications of Cryptology and Information Security 409–437 (Springer, 2017).
Cheon, J. H., Han, K., Kim, A., Kim, M. & Song, Y. Bootstrapping for approximate homomorphic encryption. In Proc. 37th Annual International Conference on the Theory and Applications of Cryptographic Techniques 360–384 (Springer, 2018).
Cheon, J. H., Han, K., Kim, A., Kim, M. & Song, Y. A full RNS variant of approximate homomorphic encryption. In Proc. 25th International Conference on Selected Areas in Cryptography 347–368 (Springer, 2019).
Kim, M., Lee, D., Seo, J. & Song, Y. Accelerating HE operations from key decomposition technique. In Proc. Annual International Cryptology Conference 70–92 (Springer, 2023).
Lee, S., Lee, G., Kim, J. W., Shin, J. & Lee, M.-K. HETAL: efficient privacy-preserving transfer learning with homomorphic encryption. In Proc. International Conference on Machine Learning 19010–19035 (PMLR, 2023).
Crockett, E. A low-depth homomorphic circuit for logistic regression model training. In Proc. 8th Workshop on Encrypted Computing & Applied Homomorphic Cryptography (ACM, 2020).
Jin, C., Ragab, M. & Aung, K. M. M. Secure transfer learning for machine fault diagnosis under different operating conditions. In Proc. International Conference on Provable Security 278–297 (Springer, 2020).
Liu, X., Deng, R. H., Choo, K.-K. R. & Yang, Y. Privacy-preserving reinforcement learning design for patient-centric dynamic treatment regimes. IEEE Trans. Emerg. Top. Comput. 9, 456–470 (2019).
Article Google Scholar
Park, J., Kim, D. S. & Lim, H. Privacy-preserving reinforcement learning using homomorphic encryption in cloud computing infrastructures. IEEE Access 8, 203564–203579 (2020).
Article Google Scholar
Haarnoja, T. et al. Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In Proc. 35th International Conference on Machine Learning 1861–1870 (PMLR, 2018).
Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. In Proc. 3rd International Conference on Learning Representations (ICLR, 2015).
Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at https://doi.org/10.48550/arXiv.1707.06347 (2017).
Mnih, V. et al. Asynchronous methods for deep reinforcement learning. In Proc. International Conference on Machine Learning 1928–1937 (PMLR, 2016).
Lyubashevsky, V., Peikert, C. & Regev, O. On ideal lattices and learning with errors over rings. In Proc. 29th Annual International Conference on the Theory and Applications of Cryptographic Techniques 1–23 (Springer, 2010).
Li, B. & Micciancio, D. On the security of homomorphic encryption on approximate numbers. In Proc. Annual International Conference on the Theory and Applications of Cryptographic Techniques 648–677 (Springer, 2021).
Al Badawi, A. et al. OpenFHE: open-source fully homomorphic encryption library. In Proc. 10th Workshop on Encrypted Computing & Applied Homomorphic Cryptography 53–63 (ACM, 2022).
Boemer, F., Kim, S., Seifu, G., Souza, F. & Gopal, V. Intel HEXL: accelerating homomorphic encryption with Intel AVX512-IFMA52. In Proc. 9th Workshop on Encrypted Computing & Applied Homomorphic Cryptography 57–62 (ACM, 2021).
Albrecht, M. R., Player, R. & Scott, S. On the concrete hardness of learning with errors. J. Math. Cryptol. 9, 169–203 (2015).
Article MathSciNet Google Scholar
Raffin, A. et al. Stable-baselines3: reliable reinforcement learning implementations. J. Mach. Learn. Res. 22, 12348–12355 (2021).
Google Scholar
Keller, M. & Sun, K. Secure quantized training for deep learning. In Proc. International Conference on Machine Learning 10912–10938 (PMLR, 2022).
Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).
Article Google Scholar
Roy, S. S., Turan, F., Järvinen, K., Vercauteren, F. & Verbauwhede, I. FPGA-based high-performance parallel architecture for homomorphic computing on encrypted data. In Proc. IEEE International Symposium on High Performance Computer Architecture 387–398 (IEEE, 2019).
Turan, F., Roy, S. S. & Verbauwhede, I. HEAWS: an accelerator for homomorphic encryption on the Amazon AWS FPGA. IEEE Trans. Comput. 69, 1185–1196 (2020).
Google Scholar
Kim, S. et al. BTS: an accelerator for bootstrappable fully homomorphic encryption. In Proc. 49th Annual International Symposium on Computer Architecture 711–725 (ACM, 2022).
Geelen, R. et al. BASALISC: Programmable hardware accelerator for BGV fully homomorphic encryption. In IACR Trans. Cryptogr. Hardw. Embed. Syst. 2023, 32–57 (2023).
Hao, M. et al. Iron: private inference on transformers. Adv. Neural Inf. Process. Syst. 35, 15718–15731 (2022).
Google Scholar
Pang, Q., Zhu, J., Möllering, H., Zheng, W. & Schneider, T. BOLT: privacy-preserving, accurate and efficient inference for transformers. In Proc. IEEE Symposium on Security and Privacy 4753–4771 (IEEE, 2024).
Lee, E. et al. Low-complexity deep convolutional neural networks on fully homomorphic encryption using multiplexed parallel convolutions. In Proc. International Conference on Machine Learning 12403–12422 (PMLR, 2022).
Mouchet, C., Troncoso-Pastoriza, J., Bossuat, J.-P. & Hubaux, J.-P. Multiparty homomorphic encryption from ring-learning-with-errors. Proc. Priv. Enhancing Technol. 2021, 291–311 (2021).
Article Google Scholar
Rakhsha, A., Radanovic, G., Devidze, R., Zhu, X. & Singla, A. Policy teaching via environment poisoning: training-time adversarial attacks against reinforcement learning. In Proc. International Conference on Machine Learning 7974–7984 (PMLR, 2020).
Rakhsha, A., Zhang, X., Zhu, X. & Singla, A. Reward poisoning in reinforcement learning: attacks against unknown learners in unknown environments. Preprint at https://doi.org/10.48550/arXiv.2102.08492 (2021).
Cascudo, I. et al. Verifiable computation for approximate homomorphic encryption schemes. In Proc. Annual International Cryptology Conference 643–677 (Springer, 2025).
Santriaji, M. H., Xue, J., Zhang, Y., Lou, Q. & Solihin, Y. DataSeal: ensuring the verifiability of private computation on encrypted data. In Proc. IEEE Symposium on Security and Privacy 2378–2394 (IEEE, 2025).
Gentry, C. A Fully Homomorphic Encryption Scheme (Stanford Univ., 2009).
Google Scholar
Brakerski, Z., Gentry, C. & Vaikuntanathan, V. (Leveled) fully homomorphic encryption without bootstrapping. ACM Trans. Comput. Theory 6, 13 (2014).
Chillotti, I., Gama, N., Georgieva, M. & Izabachene, M. Faster fully homomorphic encryption: bootstrapping in less than 0.1 seconds. In Proc. 22nd International Conference on the Theory and Application of Cryptology and Information Security 3–33 (Springer, 2016).
Kim, A., Song, Y., Kim, M., Lee, K. & Cheon, J. H. Logistic regression model training based on the approximate homomorphic encryption. BMC Med. Genomics 11, 23–31 (2018).
Article Google Scholar
Ramachandran, P., Zoph, B. & Le, Q. V. Searching for activation functions. In Proc. 6th International Conference on Learning Representations (ICLR, 2018).
Misra, D. Mish: A self-regularized non-monotonic activation function. In Proc. 31st British Machine Vision Conference (BMVA, 2020).
Adams, R. A. & Fournier, J. J. Sobolev Spaces (Elsevier, 2003).
Google Scholar
Rimes, E. Sur le calcul effectif des polynômes d’approximation de Tchebycheff. C. R. Acad. Sci. Paris 199, 337–340 (1934).
Bossuat, J.-P., Mouchet, C., Troncoso-Pastoriza, J. & Hubaux, J.-P. Efficient bootstrapping for approximate homomorphic encryption with non-sparse keys. In Proc. Annual International Conference on the Theory and Applications of Cryptographic Techniques 587–617 (Springer, 2021).
hieunch. hieunch/PPRL: v1.0.0. Zenodo https://doi.org/10.5281/zenodo.17038255 (2025).

Download references

Acknowledgements

This research was partially funded by the Australia–Vietnam Strategic Technologies Centre and the Australian Government through the Australian Research Council’s Discovery Projects funding scheme (project DE210100651, T.H.D.). M.K. was supported by the National Human Genome Research Institute grant number R01HG012604; the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (numbers 2021R1C1C1010173 and RS-2023-00211649); Basic Science Research Program through the NRF funded by the Ministry of Education (2020R1A6A1A06046728); and Korea Basic Science Institute (National research Facilities and Equipment Center) grant funded by the Ministry of Education (2023R1A6C101A009).

Author information

Authors and Affiliations

School of Electrical and Data Engineering, University of Technology Sydney, Sydney, New South Wales, Australia
Chi-Hieu Nguyen, Thai Hoang Dinh & Diep N. Nguyen
Meta AI Research (FAIR), Seattle, WA, USA
Kristin Lauter
Department of Mathematics, Hanyang University, Seoul, Republic of Korea
Miran Kim

Authors

Chi-Hieu Nguyen
View author publications
Search author on:PubMed Google Scholar
Thai Hoang Dinh
View author publications
Search author on:PubMed Google Scholar
Diep N. Nguyen
View author publications
Search author on:PubMed Google Scholar
Kristin Lauter
View author publications
Search author on:PubMed Google Scholar
Miran Kim
View author publications
Search author on:PubMed Google Scholar

Contributions

T.H.D. conceived the original idea, which was subsequently refined and expanded by C.-H.N. and D.N.N. to develop the system model, initial framework and approach. M.K. and K.L. made substantial contributions to further enhance the solution and finalize the framework. All authors had key roles in designing the testing scenarios, conducting experiments and analysing the results. Additionally, all authors actively participated in drafting the manuscript and approved the final version for submission.

Corresponding authors

Correspondence to Chi-Hieu Nguyen or Miran Kim.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks the anonymous reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (download PDF )

Supplementary Notes 1 and 2, Supplementary Fig. 1, Supplementary Tables 1–3 and Supplementary References.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Nguyen, CH., Dinh, T.H., Nguyen, D.N. et al. Empowering artificial intelligence with homomorphic encryption for secure deep reinforcement learning. Nat Mach Intell 7, 1913–1926 (2025). https://doi.org/10.1038/s42256-025-01135-2

Download citation

Received: 24 February 2025
Accepted: 17 September 2025
Published: 01 December 2025
Version of record: 01 December 2025
Issue date: December 2025
DOI: https://doi.org/10.1038/s42256-025-01135-2