Balancing State Exploration and Skill Diversity in Unsupervised Skill Discovery

Xin Liu, Yaran Chen, Guixing Chen, Haoran Li, Dongbin Zhao*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Unsupervised skill discovery aims to acquire a variety of useful skills without extrinsic reward through unsupervised reinforcement learning (RL), so that the discovered skills can be adapted efficiently, in various ways, to multiple downstream tasks. However, recent skill discovery methods struggle to balance state exploration and skill diversity, particularly when the potential skills are numerous and hard to distinguish. In this article, we propose contrastive dynamic skill discovery (ComSD), which generates diverse and exploratory unsupervised skills through a novel intrinsic incentive called the contrastive dynamic reward. This reward combines a particle-based exploration reward, which drives agents to reach far-reaching states for exploratory skill acquisition, with a novel contrastive diversity reward, which promotes discriminability between different skills. A novel dynamic weighting mechanism between these two rewards balances state exploration and skill diversity, further improving the quality of the discovered skills. Extensive experiments and analysis demonstrate that ComSD generates diverse behaviors at different exploratory levels for multijoint robots, enabling state-of-the-art adaptation performance on challenging downstream tasks. It also discovers distinguishable and far-reaching exploration skills in a challenging tree-like 2-D maze.
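
The abstract names the three ingredients of the reward but gives no formulas, so the following is only an illustrative sketch and not the ComSD implementation. It assumes a k-nearest-neighbor distance as the particle-based exploration term and an InfoNCE-style log-softmax as the contrastive diversity term; both are common choices in the unsupervised RL literature and are assumptions here. The function names, the `temperature` parameter, and the convention that the caller supplies the weight are all hypothetical, since the article's dynamic weighting rule is not described in the abstract.

```python
import math


def knn_exploration_reward(state, memory, k=3):
    """Assumed particle-based term: log(1 + mean distance to the k nearest stored states)."""
    if not memory:
        return 0.0
    nearest = sorted(math.dist(state, other) for other in memory)[:k]
    return math.log(1.0 + sum(nearest) / len(nearest))


def contrastive_diversity_reward(state_emb, skill_embs, skill_idx, temperature=0.5):
    """Assumed InfoNCE-style term: log-softmax score of the active skill.

    The score is high when the state embedding matches the active skill
    embedding better than it matches the other skill embeddings.
    """
    logits = [
        sum(a * b for a, b in zip(state_emb, skill_emb)) / temperature
        for skill_emb in skill_embs
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return logits[skill_idx] - log_norm


def combined_reward(r_explore, r_diverse, weight):
    """Weighted sum of the two terms; `weight` is supplied by the caller (hypothetical)."""
    return weight * r_explore + r_diverse


# Usage: one state, three stored states, two candidate skills.
r_exp = knn_exploration_reward((0.0, 0.0), [(3.0, 4.0), (0.0, 1.0), (6.0, 8.0)], k=2)
r_div = contrastive_diversity_reward((1.0, 0.0), [(1.0, 0.0), (0.0, 1.0)], skill_idx=0)
total = combined_reward(r_exp, r_div, weight=0.5)
```

Taking the weight as an input keeps the sketch neutral about how the balance is scheduled: a larger weight favors reaching distant states, and a smaller one favors skills that are easy to tell apart.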

Original language: English
Pages (from-to): 2234-2247
Number of pages: 14
Journal: IEEE Transactions on Cybernetics
Volume: 55
Issue number: 5
Publication status: Published - 2025
Externally published: Yes

Keywords

  • Contrastive learning (CL)
  • deep reinforcement learning (DRL)
  • exploration and exploitation
  • multitask adaptation
  • reinforcement learning (RL) pretraining
  • skill discovery
  • unsupervised RL
