Abstract
Reinforcement learning algorithms have been used to discover strategies in game theory. This study investigates whether Q learning, a classic reinforcement learning method, can train bargaining players to maximize their profit through self-play, a training paradigm used by AlphaGo. We also compare our empirical results with the known theoretical solutions and analyze the differences in detail. To this end, we propose two policy updating methods for the training process, alternate update and simultaneous update, which are tailored to two players who make offers and counter-offers in an alternating manner under a time constraint enforced by the discount factors. Our experimental results demonstrate that the value of the discount factor has a tangible impact on how far the bargaining outcomes deviate from the game-theoretic solutions.
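To illustrate how discount factors can act as a time constraint in alternating-offer bargaining, here is a minimal sketch. It assumes that the "known theoretic solutions" refer to Rubinstein's alternating-offers model over a unit pie, which the abstract does not state explicitly. The function name `rubinstein_shares` and the parameters `delta_1` and `delta_2` are hypothetical names chosen for this example, not the authors' code.

```python
def rubinstein_shares(delta_1: float, delta_2: float) -> tuple[float, float]:
    """Equilibrium split of a unit pie when player 1 proposes first.

    Assumption: the benchmark is Rubinstein's alternating-offers model,
    where delta_1 and delta_2 are the players' per-round discount factors.
    """
    if not (0.0 <= delta_1 < 1.0 and 0.0 <= delta_2 < 1.0):
        raise ValueError("discount factors must lie in [0, 1)")
    # Player 1's share in the subgame-perfect equilibrium.
    share_1 = (1.0 - delta_2) / (1.0 - delta_1 * delta_2)
    return share_1, 1.0 - share_1


if __name__ == "__main__":
    # With equal discount factors, the first proposer's share is 1 / (1 + delta).
    for delta in (0.5, 0.9, 0.99):
        first, second = rubinstein_shares(delta, delta)
        print(f"delta={delta}: proposer {first:.3f}, responder {second:.3f}")
```

Under this assumption, the first proposer's advantage shrinks as the discount factors approach 1, so a learned outcome can be compared against a split that itself depends on the discount factors.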
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2023 5th International Conference on Pattern Recognition and Intelligent Systems, PRIS 2023 |
| Editors | Wenbing Zhao, Xinguo Yu |
| Publisher | Association for Computing Machinery |
| Pages | 51-58 |
| Number of pages | 8 |
| ISBN (Electronic) | 9781450399968 |
| Publication status | Published - 28 Jul 2023 |
| Event | 5th International Conference on Pattern Recognition and Intelligent Systems, PRIS 2023 - Virtual, Online. Duration: 29 Jul 2023 → … |
Publication series
| Name | ACM International Conference Proceeding Series |
|---|---|
Conference
| Conference | 5th International Conference on Pattern Recognition and Intelligent Systems, PRIS 2023 |
|---|---|
| City | Virtual, Online |
| Period | 29/07/23 → … |
Keywords
- bargaining game
- Q learning
- self-play
Fingerprint
Dive into the research topics of 'Policy Updating Methods of Q Learning for Two Player Bargaining Game'. Together they form a unique fingerprint.
Projects
- 1 Active
- Reinforcement Learning Algorithms for Brain-Robot interaction
  Jin, N. (PI)
  1/01/24 → 31/12/26
  Project: Internal Research Project