Abstract
Training an efficient learning procedure with multiagent reinforcement learning (MARL) becomes challenging as the number of agents increases, because the observation space expands exponentially, especially in large-scale multiagent systems. In this article, we proposed a scalable attentive transfer framework (SATF) for efficient MARL, which achieved goals faster and more accurately in homogeneous and heterogeneous combat tasks by transferring learned knowledge from a small number of agents (4) to a large number of agents (up to 64). To reduce and align the dimensionality of the observed state, which varies with the number of agents, the proposed SATF deployed a novel state representation network with a self-attention mechanism, the dynamic observation representation network (DorNet), to extract the dominant observed information cost-effectively. Experiments on the *MAgent* platform showed that the SATF outperformed distributed MARL methods (independent Q-learning (IQL) and A2C) on task sequences scaling from 8 to 64 agents. Experiments on *StarCraft II* showed that the SATF outperformed centralized-training-with-decentralized-execution MARL (QMIX), requiring fewer training steps and achieving a desired win rate of up to approximately 90% as the number of agents increased from 4 to 32. The findings of our study show great potential for enhancing the efficiency of MARL training in large-scale agent combat missions.
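To make the idea of a dimension-invariant observation encoder concrete, the following minimal PyTorch sketch (not the authors' implementation; the class and parameter names such as `DorNetSketch`, `obs_dim`, and `embed_dim` are assumptions) shows how self-attention pooling can map a variable number of per-agent observations to a fixed-size embedding, so a policy trained with a small team can accept inputs from a larger one.

```python
# Illustrative sketch (not the paper's code): self-attention pooling over a variable
# number of per-agent observations, producing a fixed-size state embedding so the
# downstream policy/value network keeps the same input dimension as the team grows.
import torch
import torch.nn as nn


class DorNetSketch(nn.Module):
    """Fixed-size representation of N agent observations via attention pooling."""

    def __init__(self, obs_dim: int, embed_dim: int = 64, num_heads: int = 4):
        super().__init__()
        self.proj = nn.Linear(obs_dim, embed_dim)                 # per-agent embedding
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.query = nn.Parameter(torch.randn(1, 1, embed_dim))   # learned pooling query

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, num_agents, obs_dim); num_agents may differ between tasks
        tokens = self.proj(obs)                                   # (batch, N, embed_dim)
        query = self.query.expand(obs.size(0), -1, -1)            # (batch, 1, embed_dim)
        pooled, _ = self.attn(query, tokens, tokens)              # attend over all agents
        return pooled.squeeze(1)                                  # (batch, embed_dim), N-independent


if __name__ == "__main__":
    enc = DorNetSketch(obs_dim=10)
    print(enc(torch.randn(2, 4, 10)).shape)    # torch.Size([2, 64]) with 4 agents
    print(enc(torch.randn(2, 64, 10)).shape)   # torch.Size([2, 64]) with 64 agents
```

Because the encoder output does not depend on the number of agents, weights learned in the small-team setting can, in principle, be reused directly when the team scales up, which is the kind of transfer the abstract describes.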
| Original language | English |
| --- | --- |
| Pages (from-to) | 1-15 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Neural Networks and Learning Systems |
| DOIs | |
| Publication status | Accepted/In press - 2024 |
| Externally published | Yes |
Keywords
- Australia
- Knowledge transfer
- Multiagent reinforcement learning (MARL)
- observation representation
- Scalability
- Standards
- Task analysis
- Training
- training efficiency
- transfer learning