An actor–critic algorithm to maximize the power delivered from direct methanol fuel cells

Abstract

Optimizing nonlinear time-dependent control in complex energy systems such as direct methanol fuel cells (DMFCs) is a crucial engineering challenge. The long-term power delivery of DMFCs deteriorates as the electrocatalytic surfaces become fouled. Dynamic voltage adjustment can clean the surface and recover the activity of the catalysts; however, manually identifying optimal control strategies that account for multiple mechanisms is challenging. Here we demonstrate a nonlinear policy model (Alpha-Fuel-Cell, αFC) inspired by actor–critic reinforcement learning, which learns directly from real-world current–time trajectories to infer the state of the catalysts during operation and automatically generates a suitable action for the next timestep. Moreover, the model can provide protocols that achieve the required power while significantly slowing the degradation of the catalysts. With this model, the time-averaged power delivered by DMFCs over 12 hours is 153% relative to that of constant-potential operation. Our framework may be generalized to other energy device applications requiring long-time-horizon decision-making in the real world.
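To make the actor–critic idea concrete, the following is a minimal tabular sketch in Python. It is not the authors' αFC model, which is available through the repository listed under Code availability. The fouling environment, its two actions, the integer fouling levels and all hyperparameters (`N_LEVELS`, `CLEAN_STEP`, learning rates, discount factor) are hypothetical choices made only for illustration. The actor is a softmax policy over per-state logits, and the critic is a state-value table updated from the one-step temporal-difference error.

```python
import numpy as np

N_LEVELS = 10    # hypothetical number of discrete fouling levels (0 = clean surface)
N_ACTIONS = 2    # hypothetical actions: 0 = hold operating potential, 1 = cleaning potential
CLEAN_STEP = 5   # hypothetical number of fouling levels removed by one cleaning action


def step(level, action):
    """Toy fouling model (an assumption, not measured DMFC behaviour).

    Returns (next_level, reward), where reward stands in for delivered power.
    """
    if action == 0:
        reward = 1.0 - level / (N_LEVELS - 1)   # power falls as the surface fouls
        level = min(N_LEVELS - 1, level + 1)    # fouling accumulates while operating
    else:
        reward = 0.0                            # no useful power while cleaning
        level = max(0, level - CLEAN_STEP)      # cleaning recovers active sites
    return level, reward


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - np.max(x))
    return z / z.sum()


def train(episodes=200, horizon=50, gamma=0.95,
          lr_actor=0.1, lr_critic=0.2, seed=0):
    """One-step actor-critic on the toy environment."""
    rng = np.random.default_rng(seed)
    logits = np.zeros((N_LEVELS, N_ACTIONS))  # actor: action preferences per state
    values = np.zeros(N_LEVELS)               # critic: estimated value per state
    for _ in range(episodes):
        s = 0                                 # each episode starts from a clean surface
        for _ in range(horizon):
            probs = softmax(logits[s])
            a = rng.choice(N_ACTIONS, p=probs)
            s_next, r = step(s, a)
            # The TD error serves as the advantage signal for both updates.
            td_error = r + gamma * values[s_next] - values[s]
            values[s] += lr_critic * td_error
            # Gradient of log pi(a|s) with respect to the logits: onehot(a) - probs.
            grad = -probs
            grad[a] += 1.0
            logits[s] += lr_actor * td_error * grad
            s = s_next
    return logits, values
```

Calling `logits, values = train()` and then `softmax(logits[s])` for a given fouling level `s` shows how strongly the learned policy prefers cleaning over holding at that level. In this sketch the critic supplies a baseline, so the actor is reinforced only when an action performs better than the critic expected.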

Fig. 1: Schematic of our αFC system.
Fig. 2: Result of training and the control process of αFC.
Fig. 3: Controlling ability and analysis of the αFC system.
Fig. 4: Mechanism exploration of αFC system by ECMS.
Fig. 5: Controlling and maximizing the delivered power by αFC system.

Data availability

The data that support the findings of this study are available within the Article and its Supplementary Information. Source data are provided with this paper.

Code availability

The αFC code associated with this manuscript is available via GitHub at https://github.com/parkyjmit/alphaFC.

Acknowledgements

We acknowledge support by the Defense Advanced Research Projects Agency (DARPA) under agreement number HR00112490369. This work made use of the Materials Research Laboratory Shared Experimental Facilities at Massachusetts Institute of Technology. This work was carried out in part through the use of MIT.nano’s facilities. This work was performed in part at the Center for Nanoscale Systems of Harvard University. H.X. acknowledges Y. Rao for discussion and support. Y.J.P. also acknowledges support by a grant from the National Research Foundation of Korea (NRF) funded by the Korean government, Ministry of Science and ICT (MSIT) (number RS-2024-00356670).

Author information

Contributions

J.L., H.X. and Y.J.P. conceived the original idea. H.X. performed the synthesis, the electrochemical measurements, data collection and model construction. Y.J.P. developed and trained the αFC model. Z.R. developed automated measurements for experiments. H.X., D.J.Z. and D.M. performed ECMS measurements. D.Z., H.J., C.D. and G.Z. participated in the discussion. H.X. and Y.J.P. verified and analysed data. H.X., Y.J.P., J.L., Y.S.-H. and Y.R.-L. drafted the manuscript. All authors contributed to the revision of the manuscript.

Corresponding authors

Correspondence to Yang Shao-Horn or Ju Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Energy thanks Zhongbao Wei, Lei Xing and the other, anonymous, reviewer for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Discussion and Figs. 1–28.

Supplementary Video 1

Demonstration video of Alpha-Fuel-Cell operation.

Source data

Source Data Fig. 2

Source data for Fig. 2.

Source Data Fig. 3

Source data for Fig. 3.

Source Data Fig. 4

Source data for Fig. 4.

Source Data Fig. 5

Source data for Fig. 5.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xu, H., Park, Y.J., Ren, Z. et al. An actor–critic algorithm to maximize the power delivered from direct methanol fuel cells. Nat Energy 10, 951–961 (2025). https://doi.org/10.1038/s41560-025-01804-x

  • DOI: https://doi.org/10.1038/s41560-025-01804-x
