Review of deep reinforcement learning and discussions on the development of computer Go

Dong Bin Zhao; Kun Shao; Yuan Heng Zhu; Dong Li; Ya Ran Chen; Hai Tao Wang; De Rong Liu; Tong Zhou; Cheng Hong Wang

doi:10.7641/CTA.2016.60173

Dong Bin Zhao^*, Kun Shao, Yuan Heng Zhu, Dong Li, Ya Ran Chen, Hai Tao Wang, De Rong Liu, Tong Zhou, Cheng Hong Wang

^*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

112 Citations (Scopus)

Abstract

Deep reinforcement learning, which combines the perception of deep learning with the decision making of reinforcement learning, can output control signals directly from input images. This mechanism brings artificial intelligence much closer to human modes of thinking. Deep reinforcement learning has achieved remarkable success in both theory and application since it was proposed. 'Chuyihao-AlphaGo', a computer Go program developed by Google DeepMind and based on deep reinforcement learning, beat the world's top Go player Lee Sedol 4:1 in March 2016, marking a new milestone in the history of artificial intelligence. This paper surveys the development of deep reinforcement learning, reviews the history of computer Go, analyzes the features of the algorithms, and discusses research directions and application areas, in order to provide a valuable reference for the development of control theory and applications in a new direction.

Original language	English
Pages (from-to)	701-717
Number of pages	17
Journal	Kongzhi Lilun Yu Yingyong/Control Theory and Applications
Volume	33
Issue number	6
DOIs	https://doi.org/10.7641/CTA.2016.60173
Publication status	Published - 1 Jun 2016
Externally published	Yes

Keywords

AlphaGo
Artificial intelligence
Deep learning
Deep reinforcement learning
Reinforcement learning

Cite this

@article{be12b876cd9e4854a462d50589ccfa45,

title = "Review of deep reinforcement learning and discussions on the development of computer Go",

abstract = "Deep reinforcement learning, which combines the perception of deep learning with the decision making of reinforcement learning, can output control signals directly from input images. This mechanism brings artificial intelligence much closer to human modes of thinking. Deep reinforcement learning has achieved remarkable success in both theory and application since it was proposed. 'Chuyihao-AlphaGo', a computer Go program developed by Google DeepMind and based on deep reinforcement learning, beat the world's top Go player Lee Sedol 4:1 in March 2016, marking a new milestone in the history of artificial intelligence. This paper surveys the development of deep reinforcement learning, reviews the history of computer Go, analyzes the features of the algorithms, and discusses research directions and application areas, in order to provide a valuable reference for the development of control theory and applications in a new direction.",

keywords = "AlphaGo, Artificial intelligence, Deep learning, Deep reinforcement learning, Reinforcement learning",

author = "Zhao, {Dong Bin} and Kun Shao and Zhu, {Yuan Heng} and Dong Li and Chen, {Ya Ran} and Wang, {Hai Tao} and Liu, {De Rong} and Tong Zhou and Wang, {Cheng Hong}",

year = "2016",

month = jun,

day = "1",

doi = "10.7641/CTA.2016.60173",

language = "English",

volume = "33",

pages = "701--717",

journal = "Kongzhi Lilun Yu Yingyong/Control Theory and Applications",

issn = "1000-8152",

number = "6",

}

TY - JOUR

T1 - Review of deep reinforcement learning and discussions on the development of computer Go

AU - Zhao, Dong Bin

AU - Shao, Kun

AU - Zhu, Yuan Heng

AU - Li, Dong

AU - Chen, Ya Ran

AU - Wang, Hai Tao

AU - Liu, De Rong

AU - Zhou, Tong

AU - Wang, Cheng Hong

PY - 2016/6/1

Y1 - 2016/6/1

N2 - Deep reinforcement learning, which combines the perception of deep learning with the decision making of reinforcement learning, can output control signals directly from input images. This mechanism brings artificial intelligence much closer to human modes of thinking. Deep reinforcement learning has achieved remarkable success in both theory and application since it was proposed. 'Chuyihao-AlphaGo', a computer Go program developed by Google DeepMind and based on deep reinforcement learning, beat the world's top Go player Lee Sedol 4:1 in March 2016, marking a new milestone in the history of artificial intelligence. This paper surveys the development of deep reinforcement learning, reviews the history of computer Go, analyzes the features of the algorithms, and discusses research directions and application areas, in order to provide a valuable reference for the development of control theory and applications in a new direction.

AB - Deep reinforcement learning, which combines the perception of deep learning with the decision making of reinforcement learning, can output control signals directly from input images. This mechanism brings artificial intelligence much closer to human modes of thinking. Deep reinforcement learning has achieved remarkable success in both theory and application since it was proposed. 'Chuyihao-AlphaGo', a computer Go program developed by Google DeepMind and based on deep reinforcement learning, beat the world's top Go player Lee Sedol 4:1 in March 2016, marking a new milestone in the history of artificial intelligence. This paper surveys the development of deep reinforcement learning, reviews the history of computer Go, analyzes the features of the algorithms, and discusses research directions and application areas, in order to provide a valuable reference for the development of control theory and applications in a new direction.

KW - AlphaGo

KW - Artificial intelligence

KW - Deep learning

KW - Deep reinforcement learning

KW - Reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=84979285126&partnerID=8YFLogxK

U2 - 10.7641/CTA.2016.60173

DO - 10.7641/CTA.2016.60173

M3 - Review article

AN - SCOPUS:84979285126

SN - 1000-8152

VL - 33

SP - 701

EP - 717

JO - Kongzhi Lilun Yu Yingyong/Control Theory and Applications

JF - Kongzhi Lilun Yu Yingyong/Control Theory and Applications

IS - 6

ER -
