
AI-native cloud-edge orchestration for 6G metaverse networks: an LLM-guided multi-agent DRL approach

  • Daniel Ayepah-Mensah
  • Amine Kidane Ghebreziabiher
  • Gordon Owusu Boateng
  • Rabeb Mizouni
  • Azzam Mourad*
  • Hadi Otrok
  • Jamal Bentahar
  • *Corresponding author for this work
  • Khalifa University of Science and Technology
  • Lebanese American University
  • Concordia University

Research output: Contribution to journal › Article › peer-review

Abstract

Emerging metaverse experiences, including interactive extended reality (XR) sessions and live holographic telepresence, require motion-to-photon latencies below 10 ms while continuously streaming multi-gigabit data volumes to thousands of mobile users. The computationally intensive, artificial intelligence (AI)-driven immersive applications behind these experiences are built from microservices whose dependency structures are formally modeled as directed acyclic graphs (DAGs). Meeting these extreme requirements calls for an orchestration layer that can instantly decompose, place, and adapt those dependency structures. We propose an AI-native cloud-edge orchestration framework in which a Large Language Model (LLM)-based cloud planner serves as a cognitive conductor. This planner uses Topology-Aware Retrieval-Augmented Generation (TopoRAG) to retrieve and interpret historical deployment traces and create latency-optimized orchestration plans. Its outputs (trust-weighted logits, semantic cost estimates, and initial node bindings) are streamed as soft priors to decentralized edge workers driven by multi-agent deep reinforcement learning (DRL). These DRL agents integrate global intentions with rapidly changing local conditions to enable real-time, context-aware planning. In addition, we introduce a deviation-based reward mechanism that compares actual execution costs with the estimates predicted by the LLM, providing dense and informative feedback that effectively halves the DRL convergence time. Simulations of urban-scale 6G networks with real-time volumetric video stitching and multiuser XR gaming workloads show significantly fewer service-level agreement (SLA) violations and lower end-to-end latency than baseline schedulers, while maintaining optimal motion-to-photon latency.
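The abstract does not give the reward or prior-fusion formulas, so the following is only a minimal sketch of how the two ideas could fit together, under stated assumptions. The function names (`blend_with_prior`, `deviation_reward`), the additive fusion of logits, the `prior_weight` and `scale` parameters, and the normalized-difference reward are all hypothetical illustrations, not the authors' formulation.

```python
import math


def blend_with_prior(local_scores, prior_logits, prior_weight=0.5):
    """Fuse an edge agent's local scores with planner-provided prior logits.

    Assumption: the soft prior is added to the local scores with a fixed
    weight, and a softmax turns the result into placement probabilities.
    """
    combined = [s + prior_weight * p for s, p in zip(local_scores, prior_logits)]
    peak = max(combined)  # subtract the max for numerical stability
    exps = [math.exp(c - peak) for c in combined]
    total = sum(exps)
    return [e / total for e in exps]


def deviation_reward(actual_cost, estimated_cost, scale=1.0):
    """Reward based on how far the actual cost deviates from the estimate.

    Assumption: the reward is the relative gap between the planner's
    estimate and the measured cost. It is positive when execution was
    cheaper than estimated and negative when it was more expensive.
    """
    return scale * (estimated_cost - actual_cost) / max(estimated_cost, 1e-9)


if __name__ == "__main__":
    # Two candidate nodes with equal local scores; the prior favors node 1.
    probs = blend_with_prior([0.0, 0.0], [0.0, 2.0])
    print("placement probabilities:", probs)

    # Measured latency cost of 12 against an estimated cost of 10.
    print("reward:", deviation_reward(actual_cost=12.0, estimated_cost=10.0))
```

Because a reward of this kind is available after every placement rather than only at the end of an episode, it gives the dense feedback the abstract describes; the actual shaping used in the paper may differ.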

Original language: English
Article number: 141
Journal: Complex and Intelligent Systems
Volume: 12
Issue number: 4
DOIs
Publication status: Published - Apr 2026

Keywords

  • AI-native orchestration
  • Deep reinforcement learning
  • Edge computing
  • Large language models
  • Microservice placement
  • Retrieval-augmented generation (RAG)
