Abstract
The rapid proliferation of Artificial Intelligence applications necessitates scalable solutions that perform efficiently under real-world constraints. Heterogeneous accelerators combining specialized analog and digital units offer localized, energy-efficient neural network computation. However, achieving optimal performance on these platforms requires balancing energy efficiency and model accuracy through optimized mapping of neural network layers. To this end, we introduce Mixed-Precision Supernetwork, a unified framework for training mixed-precision supernetworks that seamlessly integrate quantized digital layers with noise-sensitive analog layers. The framework incorporates a mapping-aware adaptation strategy that dynamically optimizes layer assignments while refining the neural network via hardware-aware architecture search. This dual innovation makes Mixed-Precision Supernetwork an effective approach for deploying deep learning models efficiently on heterogeneous accelerators. On average, Mixed-Precision Supernetwork produces mappings ~2.2× faster and achieves a ~3.4% increase in model accuracy over a fully analog approach, and it improves energy efficiency by mapping up to 80% of the model's weights to analog hardware while maintaining full-precision accuracy.
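To illustrate the kind of layer-wise mapping decision described above, the following minimal PyTorch sketch shows a supernetwork layer that learns, via a per-layer architecture parameter, whether it should be executed on a quantized digital path or a noise-injected analog path. It is a hypothetical illustration only, not the authors' MPS implementation (see the repository linked under Code availability); the noise model, bit width, and energy penalty term are assumptions chosen for clarity.

```python
# Hypothetical sketch of a mixed-precision supernetwork layer: each layer learns
# whether it maps to a quantized digital path or a noisy analog path.
# Not the authors' MPS code; see https://github.com/IBM/analog-nas/tree/main/MPS.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quantize(x: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Uniform symmetric fake-quantization with a straight-through estimator."""
    scale = x.abs().max().clamp(min=1e-8) / (2 ** (bits - 1) - 1)
    q = torch.round(x / scale).clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale
    return x + (q - x).detach()  # forward uses q, backward passes gradients through


class MixedPrecisionLinear(nn.Module):
    """One supernetwork layer with two candidate mappings: digital (quantized)
    and analog (additive weight noise as a simple non-ideality model)."""

    def __init__(self, in_features, out_features, bits=8, analog_noise=0.02):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.bits = bits
        self.analog_noise = analog_noise
        # Architecture logits over {digital, analog} for this layer.
        self.alpha = nn.Parameter(torch.zeros(2))

    def forward(self, x):
        w = self.linear.weight
        digital_out = F.linear(x, fake_quantize(w, self.bits), self.linear.bias)
        noisy_w = w + self.analog_noise * w.abs().max() * torch.randn_like(w)
        analog_out = F.linear(x, noisy_w, self.linear.bias)
        probs = F.softmax(self.alpha, dim=0)  # soft mapping decision during search
        return probs[0] * digital_out + probs[1] * analog_out

    def analog_fraction(self):
        return F.softmax(self.alpha, dim=0)[1]


# Toy search step: task loss plus a penalty rewarding layers mapped to analog
# tiles (a stand-in for an energy/latency objective in hardware-aware search).
model = nn.Sequential(MixedPrecisionLinear(16, 32), nn.ReLU(), MixedPrecisionLinear(32, 4))
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
logits = model(x)
analog_frac = torch.stack(
    [m.analog_fraction() for m in model if isinstance(m, MixedPrecisionLinear)]
).mean()
loss = F.cross_entropy(logits, y) + 0.1 * (1.0 - analog_frac)
loss.backward()
opt.step()
```

After the search converges, each layer's architecture logits can be hardened into a discrete digital-or-analog assignment, which is the mapping deployed on the heterogeneous accelerator.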
Data availability
The raw data for all figures and tables can be found at https://github.com/IBM/analog-nas/tree/main/MPS.
Code availability
The code used to perform the simulations included in this study is available at https://github.com/IBM/analog-nas/tree/main/MPS.
Acknowledgements
We thank R. Haas, J. Burns, and M. Khare for management support.
Author information
Authors and Affiliations
Contributions
H.B. and A.S. initiated the project. H.B. designed and planned the project. C.L., I.B., A.V. and M.R. set up the infrastructure and tools required. M.L.G., H.T., and G.W.B. contributed to the discussion of the project and assisted with revisions of the manuscript. H.B. wrote the manuscript with input from all authors. V.N., K.E.M., and A.S. supervised the work.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Huaqiang Wu and Daniele Jahier Pagliari, who co-reviewed with Beatrice Alessandra Motetti, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Benmeziane, H., Lammie, C., Boybat, I. et al. Supernetwork-based efficient mapping of deep learning applications to mixed-precision hardware using model adaptation. Nat Commun (2026). https://doi.org/10.1038/s41467-026-71071-1
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41467-026-71071-1


