TY - GEN
T1 - CoDe
T2 - 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2024
AU - Huang, Shuangyao
AU - Zhang, Haibo
AU - Huang, Zhiyi
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - This paper introduces a cooperative and decentralized collision avoidance algorithm (CoDe) for small-scale UAV swarms consisting of up to three UAVs. CoDe improves energy efficiency of UAVs by achieving effective cooperation among UAVs. Moreover, CoDe is specifically tailored for UAV's operations by addressing the challenges faced by existing schemes, such as ineffectiveness in selecting actions from continuous action spaces and high computational complexity. CoDe is based on Multi-Agent Reinforcement Learning (MARL), and finds cooperative policies by incorporating a novel credit assignment scheme. The novel credit assignment scheme estimates the contribution of an individual by subtracting a baseline from the joint action value for the swarm. The credit assignment scheme in CoDe outperforms other benchmarks as the baseline takes into account not only the importance of a UAV's action but also the interrelation between UAVs. Furthermore, extensive experiments are conducted against existing MARL-based and conventional heuristic-based algorithms to demonstrate the advantages of the proposed algorithm.
AB - This paper introduces a cooperative and decentralized collision avoidance algorithm (CoDe) for small-scale UAV swarms consisting of up to three UAVs. CoDe improves energy efficiency of UAVs by achieving effective cooperation among UAVs. Moreover, CoDe is specifically tailored for UAV's operations by addressing the challenges faced by existing schemes, such as ineffectiveness in selecting actions from continuous action spaces and high computational complexity. CoDe is based on Multi-Agent Reinforcement Learning (MARL), and finds cooperative policies by incorporating a novel credit assignment scheme. The novel credit assignment scheme estimates the contribution of an individual by subtracting a baseline from the joint action value for the swarm. The credit assignment scheme in CoDe outperforms other benchmarks as the baseline takes into account not only the importance of a UAV's action but also the interrelation between UAVs. Furthermore, extensive experiments are conducted against existing MARL-based and conventional heuristic-based algorithms to demonstrate the advantages of the proposed algorithm.
UR - http://www.scopus.com/inward/record.url?scp=85216492470&partnerID=8YFLogxK
U2 - 10.1109/IROS58592.2024.10802312
DO - 10.1109/IROS58592.2024.10802312
M3 - Conference Proceeding
AN - SCOPUS:85216492470
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 13152
EP - 13159
BT - 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 14 October 2024 through 18 October 2024
ER -