TY - JOUR
T1 - Encoding primitives generation policy learning for robotic arm to overcome catastrophic forgetting in sequential multi-tasks learning
AU - Xiong, Fangzhou
AU - Liu, Zhiyong
AU - Huang, Kaizhu
AU - Yang, Xu
AU - Qiao, Hong
AU - Hussain, Amir
N1 - Funding Information:
The authors are grateful to the anonymous reviewers for their insightful comments and suggestions, which helped improve the quality of this paper. This work is supported by National Key Research and Development Plan of China grant 2017YFB1300202, NSFC, China grants U1613213, 61375005, 61503383, 61210009, 61876155, the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant XDB32050100, Key Program Special Fund in XJTLU, China (KSF-A-01, KSF-T-06, KSF-E-26, KSF-P-02 and KSF-A-10), Natural Science Foundation of Jiangsu Province, China BK20181189, and the UK Engineering and Physical Sciences Research Council (EPSRC) Grant No. EP/M026981/1.
Publisher Copyright:
© 2020 Elsevier Ltd
PY - 2020/9
Y1 - 2020/9
N2 - Continual learning, a widespread ability in people and animals, aims to learn and acquire new knowledge and skills continuously. Catastrophic forgetting usually occurs in continual learning when an agent attempts to learn different tasks sequentially without storing or accessing previous task information. Unfortunately, current learning systems, e.g., neural networks, tend to deviate from the weights learned in previous tasks after training on new tasks, leading to catastrophic forgetting, especially in a sequential multi-task scenario. To address this problem, in this paper, we propose an approach to overcome catastrophic forgetting, focusing on learning a series of robotic tasks sequentially. In particular, a novel hierarchical neural network framework called Encoding Primitives Generation Policy Learning (E-PGPL) is developed to enable continual learning with two components. By employing a variational autoencoder to project the original state space into a meaningful low-dimensional feature space, representative state primitives can be sampled to help learn corresponding policies for different tasks. In learning a new task, the feature space is constrained to remain close to the previous ones so that previously learned tasks are protected. Extensive experiments on several simulated robotic tasks demonstrate our method's efficacy in learning control policies for sequentially arriving tasks, delivering substantial improvement over several other continual learning methods, especially for tasks with greater diversity.
KW - Catastrophic forgetting
KW - Continual learning
KW - Robotics
KW - Sequential multi-tasks learning
UR - http://www.scopus.com/inward/record.url?scp=85086474265&partnerID=8YFLogxK
DO - 10.1016/j.neunet.2020.06.003
M3 - Article
C2 - 32535306
AN - SCOPUS:85086474265
SN - 0893-6080
VL - 129
SP - 163
EP - 173
JO - Neural Networks
JF - Neural Networks
ER -