Empowering generative AI through mobile edge computing

Ale, Laha; Zhang, Ning; King, Scott A.; Chen, Dajiang

doi:10.1038/s44287-024-00053-6

Perspective
Published: 14 June 2024

Nature Reviews Electrical Engineering volume 1, pages 478–486 (2024)

Abstract

Generative artificial intelligence (GenAI) has brought about profound transformations across diverse domains of the Internet of Things, such as manufacturing, marketing, medicine, education and work assistance. However, the proliferation of computationally intensive and highly complex GenAI models poses substantial challenges to servers and central network capacities. To effectively permeate various facets of our lives, GenAI relies heavily on mobile edge computing. In this Perspective article, we first introduce GenAI applications on edge devices, highlighting their potential to revolutionize everyday life. We then outline the challenges associated with deploying GenAI on edge devices and present possible solutions to these obstacles. Finally, we introduce an intelligent mobile edge computing paradigm that can reduce response latency, improve efficiency, strengthen security and privacy preservation, and conserve energy, opening the way to a sustainable and efficient application of the different GenAI models.

**Fig. 1: Schematic representation of intelligent mobile edge computing for the integration of generative artificial intelligence (GenAI) deployment across three distinct computational strata.**

Author information

Authors and Affiliations

School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, China
Laha Ale
Department of Electrical and Computer Engineering, University of Windsor, Ontario, Canada
Ning Zhang
Department of Computer Science, Texas A&M University Corpus Christi, Corpus Christi, TX, USA
Scott A. King
School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
Dajiang Chen

Contributions

All authors contributed equally to the preparation of this manuscript.

Corresponding authors

Correspondence to Laha Ale or Dajiang Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Electrical Engineering thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ale, L., Zhang, N., King, S.A. et al. Empowering generative AI through mobile edge computing. Nat Rev Electr Eng 1, 478–486 (2024). https://doi.org/10.1038/s44287-024-00053-6

Accepted: 24 April 2024
Published: 14 June 2024
Issue date: July 2024
DOI: https://doi.org/10.1038/s44287-024-00053-6
