Sample-Efficient Unsupervised Policy Cloning from Ensemble Self-Supervised Labeled Videos

Xin Liu, Yaran Chen*, Haoran Li*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding, Conference Proceeding, peer-review

Abstract

Current advanced policy learning methodologies have demonstrated the ability to develop expert-level strategies when provided with enough information. However, their requirements, including task-specific rewards, action-labeled expert trajectories, and large numbers of environment interactions, can be expensive or even unavailable in many scenarios. In contrast, humans can efficiently acquire skills within a few rounds of trial and error by imitating easily accessible internet videos, in the absence of any other supervision. In this paper, we try to let machines replicate this efficient watching-and-learning process through Unsupervised Policy from Ensemble Self-supervised labeled Videos (UPESV), a novel framework to efficiently learn policies from action-free videos without rewards or any other expert supervision. UPESV trains a video labeling model to infer the expert actions in expert videos through several organically combined self-supervised tasks. Each task plays its own role, and together they enable the model to make full use of both action-free videos and reward-free interactions for robust dynamics understanding and advanced action prediction. Simultaneously, UPESV clones a policy from the labeled expert videos, which in turn collects environment interactions for the self-supervised tasks. After a sample-efficient, unsupervised, and iterative training process, UPESV obtains an advanced policy based on a robust video labeling model. Extensive experiments in sixteen challenging procedurally generated environments demonstrate that the proposed UPESV achieves state-of-the-art interaction-limited policy learning performance (outperforming five current advanced baselines on 12/16 tasks) without exposure to any supervision other than videos.

Original language: English
Title of host publication: 2025 IEEE International Conference on Robotics and Automation, ICRA 2025
Editors: Christian Ott, Henny Admoni, Sven Behnke, Stjepan Bogdan, Aude Bolopion, Youngjin Choi, Fanny Ficuciello, Nicholas Gans, Clement Gosselin, Kensuke Harada, Erdal Kayacan, H. Jin Kim, Stefan Leutenegger, Zhe Liu, Perla Maiolino, Lino Marques, Takamitsu Matsubara, Anastasia Mavromatti, Mark Minor, Jason O'Kane, Hae Won Park, Hae-Won Park, Ioannis Rekleitis, Federico Renda, Elisa Ricci, Laurel D. Riek, Lorenzo Sabattini, Shaojie Shen, Yu Sun, Pierre-Brice Wieber, Katsu Yamane, Jingjin Yu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3632-3639
Number of pages: 8
ISBN (Electronic): 9798331541392
Publication status: Published - 2025
Event: 2025 IEEE International Conference on Robotics and Automation, ICRA 2025 - Atlanta, United States
Duration: 19 May 2025 to 23 May 2025

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
ISSN (Print): 1050-4729

Conference

Conference: 2025 IEEE International Conference on Robotics and Automation, ICRA 2025
Country/Territory: United States
City: Atlanta
Period: 19/05/25 to 23/05/25
