TY - JOUR
T1 - AI-native cloud-edge orchestration for 6G metaverse networks
T2 - an LLM-guided multi-agent DRL approach
AU - Ayepah-Mensah, Daniel
AU - Ghebreziabiher, Amine Kidane
AU - Boateng, Gordon Owusu
AU - Mizouni, Rabeb
AU - Mourad, Azzam
AU - Otrok, Hadi
AU - Bentahar, Jamal
N1 - Publisher Copyright: © The Author(s) 2026.
PY - 2026/4
Y1 - 2026/4
AB - Emerging metaverse experiences, including interactive extended reality (XR) sessions and live holographic telepresence, require motion-to-photon latencies below 10 ms. These applications must also sustain continuous streaming of multi-gigabit data volumes to thousands of mobile users. Meeting these requirements calls for an orchestration layer that can instantly decompose, place, and adapt the microservice dependency structures, formally modeled as directed acyclic graphs (DAGs), that underlie computationally intensive, artificial intelligence (AI)-driven immersive applications. We propose an AI-native cloud-edge orchestration framework in which a Large Language Model (LLM)-based cloud planner serves as a cognitive conductor. The planner uses Topology-Aware Retrieval-Augmented Generation (TopoRAG) to retrieve and interpret historical deployment traces and produce latency-optimized orchestration plans. It outputs trust-weighted logits, semantic cost estimates, and initial node bindings as soft priors and streams them to decentralized edge workers driven by multi-agent deep reinforcement learning (DRL). The DRL agents combine these global intentions with rapidly changing local conditions to enable real-time, context-aware planning. In addition, we introduce a deviation-based reward mechanism that compares actual execution costs with the estimates predicted by the LLM, providing dense and informative feedback that effectively halves the DRL convergence time. Simulations of urban-scale 6G networks with real-time volumetric video stitching and multiuser XR gaming workloads show a significant reduction in service-level agreement (SLA) violations and significantly lower end-to-end latency than baseline schedulers, while maintaining optimal motion-to-photon latency.
KW - AI-native orchestration
KW - Deep reinforcement learning
KW - Edge computing
KW - Large language models
KW - Microservice placement
KW - Retrieval-augmented generation (RAG)
UR - https://www.scopus.com/pages/publications/105034872052
U2 - 10.1007/s40747-026-02249-9
DO - 10.1007/s40747-026-02249-9
M3 - Article
AN - SCOPUS:105034872052
SN - 2199-4536
VL - 12
JO - Complex and Intelligent Systems
JF - Complex and Intelligent Systems
IS - 4
M1 - 141
ER -