TY - GEN
T1 - Convolutional fitted Q iteration for vision-based control problems
AU - Zhao, Dongbin
AU - Zhu, Yuanheng
AU - Lv, Le
AU - Chen, Yaran
AU - Zhang, Qichao
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/10/31
Y1 - 2016/10/31
AB - This paper proposes a deep reinforcement learning (DRL) method for control problems that take raw image pixels as input states. A convolutional neural network (CNN), termed Q-CNN, is used to approximate Q functions. The parameters of Q-CNN are initialized from a network pretrained on a classification challenge over a vast set of natural images. This initialization provides Q-CNN with image-representation features, so that training can concentrate on the control task. The weights are tuned under the scheme of fitted Q iteration (FQI), an offline reinforcement learning method with a stable convergence property. To demonstrate performance, a modified Food-Poison problem is simulated, in which the agent determines its movements based on its forward view. The algorithm learns a satisfactory policy that performs better than the results of previous research.
KW - Convolutional neural network
KW - Deep reinforcement learning
KW - Fitted Q iteration
KW - Vision-based control
UR - http://www.scopus.com/inward/record.url?scp=85007275358&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2016.7727794
DO - 10.1109/IJCNN.2016.7727794
M3 - Conference Proceeding
AN - SCOPUS:85007275358
T3 - Proceedings of the International Joint Conference on Neural Networks
SP - 4539
EP - 4544
BT - 2016 International Joint Conference on Neural Networks, IJCNN 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2016 International Joint Conference on Neural Networks, IJCNN 2016
Y2 - 24 July 2016 through 29 July 2016
ER -