Empowering artificial intelligence with homomorphic encryption for secure deep reinforcement learning

Abstract

Deep reinforcement learning (DRL) demonstrates significant potential in solving complex control and decision-making problems, but it may inadvertently expose sensitive, environment-specific information, raising privacy and security concerns for computer systems, humans and organizations. This work introduces a privacy-preserving framework using homomorphic encryption and advanced learning algorithms to secure DRL processes. Our framework enables the encryption of sensitive information, including states, actions and rewards, before sharing it with an untrusted processing platform. This encryption ensures data privacy, prevents unauthorized access and maintains compliance with data protection laws throughout the learning process. In addition, we develop innovative algorithms to efficiently handle a wide range of encrypted control tasks. Our core innovation is the homomorphic encryption-compatible Adam optimizer, which reparameterizes momentum values to bypass the need for high-degree polynomial approximations of inverse square roots on encrypted data. This adaptation, previously unexplored in homomorphic encryption-based ML research, enables stable and efficient training with adaptive learning rates in encrypted domains, addressing a critical bottleneck for privacy-preserving DRL with sparse rewards. Evaluations on standard DRL benchmarks demonstrate that our encrypted DRL performs comparably with its unencrypted counterpart (with a gap of less than 10%) while maintaining data confidentiality with homomorphic encryption. This work facilitates the integration of privacy-preserving DRL into real-world applications, addressing critical privacy concerns and promoting the ethical advancement of artificial intelligence.
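To make the inverse-square-root bottleneck mentioned above concrete, the following minimal Python sketch may help. It is not the authors' reparameterized optimizer and it uses no encryption. It only shows, in plaintext, where the standard Adam update needs 1/sqrt(v), and how that quantity could be approximated using additions and multiplications alone, the operations that approximate homomorphic encryption schemes support natively. The function names `adam_step`, `inv_sqrt_newton` and `relative_error`, the Newton iteration, the starting guess `y0` and the chosen inputs are all illustrative assumptions.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard plaintext Adam update for a scalar parameter."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)  # bias-corrected second moment
    # Dividing by sqrt(v_hat) has no direct counterpart when only encrypted
    # additions and multiplications are available.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


def inv_sqrt_newton(x, iterations, y0=1.0):
    """Approximate 1/sqrt(x) with additions and multiplications only.

    Newton's iteration y <- y * (3 - x * y^2) / 2 converges when
    0 < x * y0^2 < 3. Each iteration raises the polynomial degree in x
    and consumes multiplicative depth.
    """
    y = y0
    for _ in range(iterations):
        y = 0.5 * y * (3.0 - x * y * y)
    return y


def relative_error(x, iterations):
    """Relative error of the Newton approximation against math.sqrt."""
    exact = 1.0 / math.sqrt(x)
    return abs(inv_sqrt_newton(x, iterations) - exact) / exact


if __name__ == "__main__":
    # Illustrative inputs only: second-moment values can span a wide range.
    for x in (1.0, 1e-2, 1e-4):
        errors = [f"{relative_error(x, k):.1e}" for k in (3, 10, 20)]
        print(f"x={x:g}: relative error after 3/10/20 iterations = {errors}")
```

In this sketch, inputs close to 1 converge within a few iterations, whereas small inputs need many more. This suggests why covering a wide range of second-moment values with a polynomial is costly on encrypted data.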

Fig. 1: Workflow of our proposed privacy-preserving DRL framework.
Fig. 2: Overview of the proposed HE-based SAC framework with polynomial approximations and computation map optimization.
Fig. 3: Learning performance across RL environments.
Fig. 4: Illustration of different interactive HE-SAC variants.

Data availability

All data used in this study are publicly available. The Pendulum and CartPole environments are available via GitHub (https://github.com/openai/gym) and the MobileEnv environment is also available via GitHub (https://github.com/stefanbschneider/mobile-env).

Code availability

The software code supporting the findings of the secure SAC algorithm in this study is available via GitHub (https://github.com/hieunch/PPRL) and via Zenodo (https://doi.org/10.5281/zenodo.17038255; ref. 64).

References

  1. Huang, Z., Wu, J. & Lv, C. Efficient deep reinforcement learning with imitative expert priors for autonomous driving. IEEE Trans. Neural Netw. Learn. Syst. 33, 3675–3684 (2022).

  2. Feng, S. et al. Dense reinforcement learning for safety validation of autonomous vehicles. Nature 615, 620–627 (2023).

  3. Hachem, E. et al. Reinforcement learning for patient-specific optimal stenting of intracranial aneurysms. Sci. Rep. 13, 7147 (2023).

  4. Hu, M., Zhang, J., Matkovic, L., Liu, T. & Yang, X. Reinforcement learning in medical image analysis: concepts, applications, challenges, and future directions. J. Appl. Clin. Med. Phys. 24, 13898 (2023).

  5. Lei, Y. et al. New challenges in reinforcement learning: a survey of security and privacy. Artif. Intell. Rev. 56, 7195–7236 (2023).

  6. Mo, K. et al. Security and privacy issues in deep reinforcement learning: threats and countermeasures. ACM Comput. Surv. 56, 152 (2024).

  7. Pan, X. et al. How you act tells a lot: privacy-leaking attack on deep reinforcement learning. In Proc. 18th International Conference on Autonomous Agents and MultiAgent Systems 368–376 (ACM, 2019).

  8. Vietri, G., Balle, B., Krishnamurthy, A. & Wu, S. Private reinforcement learning with PAC and regret guarantees. In Proc. International Conference on Machine Learning 9754–9764 (PMLR, 2020).

  9. Garcelon, E., Perchet, V., Pike-Burke, C. & Pirotta, M. Local differential privacy for regret minimization in reinforcement learning. Adv. Neural Inf. Process. Syst. 34, 10561–10573 (2021).

  10. Chowdhury, S. R. & Zhou, X. Differentially private regret minimization in episodic Markov decision processes. In Proc. AAAI Conference on Artificial Intelligence 36, 6375–6383 (AAAI, 2022).

  11. Qiao, D. & Wang, Y.-X. Near-optimal differentially private reinforcement learning. In Proc. International Conference on Artificial Intelligence and Statistics 9914–9940 (PMLR, 2023).

  12. Jesu, A., Darvariu, V.-A., Staffolani, A., Montanari, R. & Musolesi, M. Reinforcement learning on encrypted data. Preprint at https://doi.org/10.48550/arXiv.2109.08236 (2021).

  13. Knott, B. et al. CrypTen: secure multi-party computation meets machine learning. Adv. Neural Inf. Process. Syst. 34, 4961–4973 (2021).

  14. Tan, S., Knott, B., Tian, Y. & Wu, D. J. CryptGPU: fast privacy-preserving machine learning on the GPU. In Proc. IEEE Symposium on Security and Privacy (SP) 1021–1038 (IEEE, 2021).

  15. Rathee, D., Bhattacharya, A., Gupta, D., Sharma, R. & Song, D. Secure floating-point training. In Proc. 32nd USENIX Security Symposium 6329–6346 (USENIX Association, 2023).

  16. Wang, S. et al. Privacy-aware estimation of relatedness in admixed populations. Brief. Bioinform. 23, 473 (2022).

  17. Li, W. et al. COLLAGENE enables privacy-aware federated and collaborative genomic data analysis. Genome Biol. 24, 204 (2023).

  18. Nandakumar, K., Ratha, N., Pankanti, S. & Halevi, S. Towards deep neural network training on encrypted data. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops 40–48 (IEEE, 2019).

  19. Al Badawi, A., Hoang, L., Mun, C. F., Laine, K. & Aung, K. M. M. PrivFT: private and fast text classification with homomorphic encryption. IEEE Access 8, 226544–226556 (2020).

  20. Kim, M., Jiang, X., Lauter, K., Ismayilzada, E. & Shams, S. Secure human action recognition by encrypted neural network inference. Nat. Commun. 13, 4799 (2022).

  21. Nguyen, C. et al. Encrypted data caching and learning framework for robust federated learning-based mobile edge computing. IEEE/ACM Trans. Netw. 32, 2705–2720 (2024).

  22. Cheon, J. H., Kim, A., Kim, M. & Song, Y. Homomorphic encryption for arithmetic of approximate numbers. In Proc. 23rd International Conference on the Theory and Applications of Cryptology and Information Security 409–437 (Springer, 2017).

  23. Cheon, J. H., Han, K., Kim, A., Kim, M. & Song, Y. Bootstrapping for approximate homomorphic encryption. In Proc. 37th Annual International Conference on the Theory and Applications of Cryptographic Techniques 360–384 (Springer, 2018).

  24. Cheon, J. H., Han, K., Kim, A., Kim, M. & Song, Y. A full RNS variant of approximate homomorphic encryption. In Proc. 25th International Conference on Selected Areas in Cryptography 347–368 (Springer, 2019).

  25. Kim, M., Lee, D., Seo, J. & Song, Y. Accelerating HE operations from key decomposition technique. In Proc. Annual International Cryptology Conference 70–92 (Springer, 2023).

  26. Lee, S., Lee, G., Kim, J. W., Shin, J. & Lee, M.-K. HETAL: efficient privacy-preserving transfer learning with homomorphic encryption. In Proc. International Conference on Machine Learning 19010–19035 (PMLR, 2023).

  27. Crockett, E. A low-depth homomorphic circuit for logistic regression model training. In Proc. 8th Workshop on Encrypted Computing & Applied Homomorphic Cryptography (ACM, 2020).

  28. Jin, C., Ragab, M. & Aung, K. M. M. Secure transfer learning for machine fault diagnosis under different operating conditions. In Proc. International Conference on Provable Security 278–297 (Springer, 2020).

  29. Liu, X., Deng, R. H., Choo, K.-K. R. & Yang, Y. Privacy-preserving reinforcement learning design for patient-centric dynamic treatment regimes. IEEE Trans. Emerg. Top. Comput. 9, 456–470 (2019).

  30. Park, J., Kim, D. S. & Lim, H. Privacy-preserving reinforcement learning using homomorphic encryption in cloud computing infrastructures. IEEE Access 8, 203564–203579 (2020).

  31. Haarnoja, T. et al. Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In Proc. 35th International Conference on Machine Learning 1861–1870 (PMLR, 2018).

  32. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proc. 3rd International Conference on Learning Representations (ICLR, 2015).

  33. Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at https://doi.org/10.48550/arXiv.1707.06347 (2017).

  34. Mnih, V. et al. Asynchronous methods for deep reinforcement learning. In Proc. International Conference on Machine Learning 1928–1937 (PMLR, 2016).

  35. Lyubashevsky, V., Peikert, C. & Regev, O. On ideal lattices and learning with errors over rings. In Proc. 29th Annual International Conference on the Theory and Applications of Cryptographic Techniques 1–23 (Springer, 2010).

  36. Li, B. & Micciancio, D. On the security of homomorphic encryption on approximate numbers. In Proc. Annual International Conference on the Theory and Applications of Cryptographic Techniques 648–677 (Springer, 2021).

  37. Al Badawi, A. et al. OpenFHE: open-source fully homomorphic encryption library. In Proc. 10th Workshop on Encrypted Computing & Applied Homomorphic Cryptography 53–63 (ACM, 2022).

  38. Boemer, F., Kim, S., Seifu, G., Souza, F. & Gopal, V. Intel HEXL: accelerating homomorphic encryption with Intel AVX512-IFMA52. In Proc. 9th Workshop on Encrypted Computing & Applied Homomorphic Cryptography 57–62 (ACM, 2021).

  39. Albrecht, M. R., Player, R. & Scott, S. On the concrete hardness of learning with errors. J. Math. Cryptol. 9, 169–203 (2015).

  40. Raffin, A. et al. Stable-baselines3: reliable reinforcement learning implementations. J. Mach. Learn. Res. 22, 12348–12355 (2021).

  41. Keller, M. & Sun, K. Secure quantized training for deep learning. In Proc. International Conference on Machine Learning 10912–10938 (PMLR, 2022).

  42. Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).

  43. Roy, S. S., Turan, F., Järvinen, K., Vercauteren, F. & Verbauwhede, I. FPGA-based high-performance parallel architecture for homomorphic computing on encrypted data. In Proc. IEEE International Symposium on High Performance Computer Architecture 387–398 (IEEE, 2019).

  44. Turan, F., Roy, S. S. & Verbauwhede, I. HEAWS: an accelerator for homomorphic encryption on the Amazon AWS FPGA. IEEE Trans. Comput. 69, 1185–1196 (2020).

  45. Kim, S. et al. BTS: an accelerator for bootstrappable fully homomorphic encryption. In Proc. 49th Annual International Symposium on Computer Architecture 711–725 (ACM, 2022).

  46. Geelen, R. et al. BASALISC: programmable hardware accelerator for BGV fully homomorphic encryption. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2023, 32–57 (2023).

  47. Hao, M. et al. Iron: private inference on transformers. Adv. Neural Inf. Process. Syst. 35, 15718–15731 (2022).

  48. Pang, Q., Zhu, J., Möllering, H., Zheng, W. & Schneider, T. BOLT: privacy-preserving, accurate and efficient inference for transformers. In Proc. IEEE Symposium on Security and Privacy 4753–4771 (IEEE, 2024).

  49. Lee, E. et al. Low-complexity deep convolutional neural networks on fully homomorphic encryption using multiplexed parallel convolutions. In Proc. International Conference on Machine Learning 12403–12422 (PMLR, 2022).

  50. Mouchet, C., Troncoso-Pastoriza, J., Bossuat, J.-P. & Hubaux, J.-P. Multiparty homomorphic encryption from ring-learning-with-errors. Proc. Priv. Enhancing Technol. 2021, 291–311 (2021).

  51. Rakhsha, A., Radanovic, G., Devidze, R., Zhu, X. & Singla, A. Policy teaching via environment poisoning: training-time adversarial attacks against reinforcement learning. In Proc. International Conference on Machine Learning 7974–7984 (PMLR, 2020).

  52. Rakhsha, A., Zhang, X., Zhu, X. & Singla, A. Reward poisoning in reinforcement learning: attacks against unknown learners in unknown environments. Preprint at https://doi.org/10.48550/arXiv.2102.08492 (2021).

  53. Cascudo, I. et al. Verifiable computation for approximate homomorphic encryption schemes. In Proc. Annual International Cryptology Conference 643–677 (Springer, 2025).

  54. Santriaji, M. H., Xue, J., Zhang, Y., Lou, Q. & Solihin, Y. DataSeal: ensuring the verifiability of private computation on encrypted data. In Proc. IEEE Symposium on Security and Privacy 2378–2394 (IEEE, 2025).

  55. Gentry, C. A Fully Homomorphic Encryption Scheme (Stanford Univ., 2009).

  56. Brakerski, Z., Gentry, C. & Vaikuntanathan, V. (Leveled) fully homomorphic encryption without bootstrapping. ACM Trans. Comput. Theory 6, 13 (2014).

  57. Chillotti, I., Gama, N., Georgieva, M. & Izabachene, M. Faster fully homomorphic encryption: bootstrapping in less than 0.1 seconds. In Proc. 22nd International Conference on the Theory and Application of Cryptology and Information Security 3–33 (Springer, 2016).

  58. Kim, A., Song, Y., Kim, M., Lee, K. & Cheon, J. H. Logistic regression model training based on the approximate homomorphic encryption. BMC Med. Genomics 11, 23–31 (2018).

  59. Ramachandran, P., Zoph, B. & Le, Q. V. Searching for activation functions. In Proc. 6th International Conference on Learning Representations (ICLR, 2018).

  60. Misra, D. Mish: a self-regularized non-monotonic activation function. In Proc. 31st British Machine Vision Conference (BMVA, 2020).

  61. Adams, R. A. & Fournier, J. J. Sobolev Spaces (Elsevier, 2003).

  62. Remes, E. Sur le calcul effectif des polynômes d’approximation de Tchebycheff. C. R. Acad. Sci. Paris 199, 337–340 (1934).

  63. Bossuat, J.-P., Mouchet, C., Troncoso-Pastoriza, J. & Hubaux, J.-P. Efficient bootstrapping for approximate homomorphic encryption with non-sparse keys. In Proc. Annual International Conference on the Theory and Applications of Cryptographic Techniques 587–617 (Springer, 2021).

  64. hieunch. hieunch/PPRL: v1.0.0. Zenodo https://doi.org/10.5281/zenodo.17038255 (2025).

Acknowledgements

This research was partially funded by the Australia–Vietnam Strategic Technologies Centre and the Australian Government through the Australian Research Council’s Discovery Projects funding scheme (project DE210100651, T.H.D.). M.K. was supported by the National Human Genome Research Institute grant number R01HG012604; the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (numbers 2021R1C1C1010173 and RS-2023-00211649); Basic Science Research Program through the NRF funded by the Ministry of Education (2020R1A6A1A06046728); and Korea Basic Science Institute (National Research Facilities and Equipment Center) grant funded by the Ministry of Education (2023R1A6C101A009).

Author information

Contributions

T.H.D. conceived the original idea, which was subsequently refined and expanded by C.-H.N. and D.N.N. to develop the system model, initial framework and approach. M.K. and K.L. made substantial contributions to further enhance the solution and finalize the framework. All authors had key roles in designing the testing scenarios, conducting experiments and analysing the results. Additionally, all authors actively participated in drafting the manuscript and approved the final version for submission.

Corresponding authors

Correspondence to Chi-Hieu Nguyen or Miran Kim.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks the anonymous reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Notes 1 and 2, Supplementary Fig. 1, Supplementary Tables 1–3 and Supplementary References.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Nguyen, CH., Dinh, T.H., Nguyen, D.N. et al. Empowering artificial intelligence with homomorphic encryption for secure deep reinforcement learning. Nat Mach Intell 7, 1913–1926 (2025). https://doi.org/10.1038/s42256-025-01135-2

  • DOI: https://doi.org/10.1038/s42256-025-01135-2
