TY - GEN
T1 - Toward Multi-Agent Coordination in IoT via Prompt Pool-based Continual Reinforcement Learning
AU - Xu, Chenhang
AU - Wang, Jia
AU - Zhu, Xiaohui
AU - Yue, Yong
AU - Qi, Jun
AU - Ma, Jieming
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - The Internet of Things (IoT) represents a complex, dynamic environment where edge devices continuously optimize their policies to address a continual stream of tasks. Previous studies have typically relied on a rehearsal buffer containing data from past tasks or a known task identity to mitigate catastrophic forgetting. Our research, Prompt Pool-based Continual Reinforcement Learning (PPCRL), aims to create a more efficient memory system by expanding a single prompt into a prompt pool, allowing agents to automatically select a set of relevant prompts without needing task identity knowledge. Similar to prompt-based learning techniques, our approach utilizes a small trainable prompt pool to guide pre-trained models through sequential task learning systematically. This allows us to optimize prompts for guiding model predictions and effectively manage both shared and task-specific knowledge while maintaining model generalization. We conducted experiments on two multi-agent benchmarks where traditional methods suffer from significant performance degradation. In contrast, PPCRL demonstrates the capability to outperform baselines and exhibits high generalization ability.
KW - Continual reinforcement learning
KW - IoT
KW - Prompt pool
KW - Prompt-based learning
UR - http://www.scopus.com/inward/record.url?scp=105000173983&partnerID=8YFLogxK
U2 - 10.1109/ISPA63168.2024.00296
DO - 10.1109/ISPA63168.2024.00296
M3 - Conference Proceeding
AN - SCOPUS:105000173983
T3 - Proceedings - 2024 IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2024
SP - 2170
EP - 2177
BT - Proceedings - 2024 IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 22nd IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2024
Y2 - 30 October 2024 through 2 November 2024
ER -