TY - GEN
T1 - Diversity-Based Trajectory and Goal Selection with Hindsight Experience Replay
AU - Dai, Tianhong
AU - Liu, Hengyan
AU - Arulkumaran, Kai
AU - Ren, Guangyu
AU - Bharath, Anil Anthony
N1 - Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Hindsight experience replay (HER) is a goal relabelling technique typically used with off-policy deep reinforcement learning algorithms to solve goal-oriented tasks; it is well suited to robotic manipulation tasks that deliver only sparse rewards. In HER, both trajectories and transitions are sampled uniformly for training. However, not all of the agent’s experiences contribute equally to training, and so naive uniform sampling may lead to inefficient learning. In this paper, we propose diversity-based trajectory and goal selection with HER (DTGSH). Firstly, trajectories are sampled according to the diversity of the goal states as modelled by determinantal point processes (DPPs). Secondly, transitions with diverse goal states are selected from the trajectories by using k-DPPs. We evaluate DTGSH on five challenging robotic manipulation tasks in simulated robot environments, where we show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.
KW - Deep reinforcement learning
KW - Determinantal point processes
KW - Hindsight experience replay
UR - http://www.scopus.com/inward/record.url?scp=85119277258&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-89370-5_3
DO - 10.1007/978-3-030-89370-5_3
M3 - Conference Proceeding
AN - SCOPUS:85119277258
SN - 9783030893699
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 32
EP - 45
BT - PRICAI 2021
A2 - Pham, Duc Nghia
A2 - Theeramunkong, Thanaruk
A2 - Governatori, Guido
A2 - Liu, Fenrong
PB - Springer Science and Business Media Deutschland GmbH
T2 - 18th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2021
Y2 - 8 November 2021 through 12 November 2021
ER -