A RAG-Assisted DRL Framework for Microservices Deployment in 6G Vehicular Networks

  • Daniel Ayepah-Mensah
  • Amine Kidane Ghebreziabiher
  • Gordon Owusu Boateng
  • Rabeb Mizouni
  • Azzam Mourad
  • Hadi Otrok
  • Jamal Bentahar
  • Sami Muhaidat

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Modern edge cloud platforms must efficiently deploy and route containerized microservice DAGs under strict latency and cost constraints, while adapting to rapidly changing workloads and infrastructure states. Deep Reinforcement Learning (DRL) schedulers adapt well to dynamics but often lack semantic awareness of service intent and task dependencies, resulting in suboptimal decisions in unseen scenarios. To overcome these limitations, we introduce a Retrieval-Augmented Generation-assisted DRL (RAG-DRL) framework that integrates a lightweight DRL agent with a graph-based RAG module powered by a partially frozen LLM. A dynamic memory graph encodes contextual information such as node resources, network latencies, and SLA feedback. The LLM retrieves relevant historical deployments and current service intents to generate soft placement plans and reward estimates, which guide the DRL agent. These priors accelerate convergence, improve generalization across diverse conditions, and ensure real-time responsiveness. Evaluations on a realistic urban-scale edge cloud testbed confirm that RAG-DRL significantly reduces SLA violations, end-to-end latency, and resource imbalance, outperforming modern container-based schedulers. Our framework converges faster, maintains latency below 65 ms at scale, limits SLA violations to 12% under heavy load, and achieves 90% resource utilization with balanced distribution.
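
The abstract does not state how the soft placement plans and reward estimates enter the DRL agent's update. As a minimal illustrative sketch only, and not the authors' method, one common way to apply such priors is to bias the agent's action logits with the log of the prior placement probabilities and to add a scaled reward estimate to the environment reward. All names and parameters below (`blend_policy_with_prior`, `shaped_reward`, `beta`, `alpha`) are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def blend_policy_with_prior(policy_logits, prior_probs, beta=1.0, eps=1e-8):
    """Bias the agent's per-node logits with a soft placement prior.

    Assumption: the prior is a probability per candidate node, and it is
    combined as logits + beta * log(prior). beta = 0 ignores the prior.
    """
    if len(policy_logits) != len(prior_probs):
        raise ValueError("logits and prior must cover the same nodes")
    biased = [
        logit + beta * math.log(p + eps)
        for logit, p in zip(policy_logits, prior_probs)
    ]
    return softmax(biased)


def shaped_reward(env_reward, estimated_reward, alpha=0.1):
    """Add a scaled reward estimate to the observed reward (hypothetical)."""
    return env_reward + alpha * estimated_reward


# Usage: three candidate nodes, an undecided agent, and a prior favoring node 0.
probs = blend_policy_with_prior([0.0, 0.0, 0.0], [0.7, 0.2, 0.1])
chosen_node = max(range(len(probs)), key=probs.__getitem__)
```

In this sketch an untrained agent with uniform logits simply follows the prior, while a trained agent with confident logits can override it; `beta` controls how strongly the prior weighs in.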

Original language: English
Title of host publication: 2025 21st International Conference on Wireless and Mobile Computing, Networking and Communications, WiMob 2025
Publisher: IEEE Computer Society
ISBN (Electronic): 9798350392814
Publication status: Published - 2025
Event: 21st International Conference on Wireless and Mobile Computing, Networking and Communications, WiMob 2025 - Marrakesh, Morocco
Duration: 20 Oct 2025 to 22 Oct 2025

Publication series

Name: International Conference on Wireless and Mobile Computing, Networking and Communications
ISSN (Print): 2161-9646
ISSN (Electronic): 2161-9654

Conference

Conference: 21st International Conference on Wireless and Mobile Computing, Networking and Communications, WiMob 2025
Country/Territory: Morocco
City: Marrakesh
Period: 20/10/25 to 22/10/25

Keywords

  • Deep Reinforcement Learning
  • Edge-Cloud Orchestration
  • Large Language Models (LLMs)
  • Microservice Deployment
  • Retrieval-Augmented Generation (RAG)
