GP-PAIL: Generative Adversarial Imitation Learning in Massive-Agent Environments

Yulong Li, Boqian Wang, Jionglong Su*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding | Conference Proceeding | peer-review

Abstract

Traditional multi-agent reinforcement learning algorithms are unsuitable for massive-agent environments, where the problems of credit allocation, dense reward function design, and staged curriculum design become pronounced. Since massive-agent environments can only give sparse rewards to the agents, these algorithms have difficulty learning effective actions. While specially designed dense reward functions can help the agents obtain more reward signals, the algorithms then face a trade-off between convergence speed and generalization ability: a hand-crafted reward function can provide frequent feedback that accelerates learning, but excessively detailed rewards may cause the agents to focus on short-term rewards and overlook long-term goals, resulting in a sub-optimal strategy. To address this, we propose GP-PAIL (Generative Pixel-to-Pixel Adversarial Imitation Learning), a novel generative adversarial imitation learning algorithm that uses a pixel-to-pixel policy structure for centralized control. It mitigates the issues of fixed behavioral patterns and credit allocation inherent in specially designed dense reward functions and staged curricula, enhancing imitation learning in massive-agent environments. Experimental results demonstrate the efficacy of GP-PAIL, with a 92% win rate against the best12 algorithm. Furthermore, in terms of early skill learning speed, it learns nearly 3.25 times faster, measured in number of episodes, than the current state-of-the-art best32 algorithm.

Original language: English
Title of host publication: 2024 IEEE 7th International Conference on Big Data and Artificial Intelligence, BDAI 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 314-322
Number of pages: 9
ISBN (Electronic): 9798350352009
Publication status: Published - 2024
Event: 7th IEEE International Conference on Big Data and Artificial Intelligence, BDAI 2024 - Beijing, China
Duration: 5 Jul 2024 to 7 Jul 2024

Publication series

Name: 2024 IEEE 7th International Conference on Big Data and Artificial Intelligence, BDAI 2024

Conference

Conference: 7th IEEE International Conference on Big Data and Artificial Intelligence, BDAI 2024
Country/Territory: China
City: Beijing
Period: 5/07/24 to 7/07/24

Keywords

  • Behavioral Cloning
  • Generative Adversarial Imitation Learning
  • LUX
  • Massive-Agent Reinforcement Learning Environments
  • Pixel-To-Pixel Policy Architecture
