TY - GEN
T1 - DouRN
T2 - 15th International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2023
AU - Chen, Yiquan
AU - Lyu, Yingchao
AU - Zhang, Di
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Deep reinforcement learning has made significant progress in games with imperfect information, but its performance in the card game Doudizhu (Chinese Poker/Fight the Landlord) remains unsatisfactory. Doudizhu is different from conventional games as it involves three players and combines elements of cooperation and confrontation, resulting in a large state and action space. In 2021, a Doudizhu program called DouZero [8] surpassed previous models without prior knowledge by utilizing traditional Monte Carlo methods and multilayer perceptrons. Building on this work, our study incorporates residual networks into the model, explores different architectural designs, and conducts multi-role testing. Our findings demonstrate that this model significantly improves the winning rate within the same training time. Additionally, we introduce a call scoring system to assist the agent in deciding whether to become a landlord. With these enhancements, our model consistently outperforms the existing version of DouZero and even experienced human players. The source code is available at https://github.com/Yingchaol/Douzero_Resnet.git.
AB - Deep reinforcement learning has made significant progress in games with imperfect information, but its performance in the card game Doudizhu (Chinese Poker/Fight the Landlord) remains unsatisfactory. Doudizhu is different from conventional games as it involves three players and combines elements of cooperation and confrontation, resulting in a large state and action space. In 2021, a Doudizhu program called DouZero [8] surpassed previous models without prior knowledge by utilizing traditional Monte Carlo methods and multilayer perceptrons. Building on this work, our study incorporates residual networks into the model, explores different architectural designs, and conducts multi-role testing. Our findings demonstrate that this model significantly improves the winning rate within the same training time. Additionally, we introduce a call scoring system to assist the agent in deciding whether to become a landlord. With these enhancements, our model consistently outperforms the existing version of DouZero and even experienced human players. The source code is available at https://github.com/Yingchaol/Douzero_Resnet.git.
KW - DouDizhu
KW - Monte Carlo Methods
KW - Reinforcement Learning
KW - Residual Neural Networks
UR - http://www.scopus.com/inward/record.url?scp=85186768696&partnerID=8YFLogxK
U2 - 10.1109/CyberC58899.2023.00026
DO - 10.1109/CyberC58899.2023.00026
M3 - Conference Proceeding
AN - SCOPUS:85186768696
T3 - Proceedings - 2023 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2023
SP - 96
EP - 99
BT - Proceedings - 2023 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 2 November 2023 through 4 November 2023
ER -