Reinforcement Learning Algorithm for Two-Leg Robot with DDPG and TD3

Dexuan Li*, Nanlin Jin

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding > Conference Proceeding > peer-review

Abstract

Reinforcement Learning (RL) is becoming popular for two-legged robots, which learn and improve through trial and error by adjusting their actions based on feedback. Deep RL, which combines RL with deep learning, handles the high-dimensional state and action spaces found in robotics. Two important Deep RL algorithms are Deep Deterministic Policy Gradient (DDPG) and Twin Delayed DDPG (TD3). Our proposed algorithm, which improves on TD3, aims to maximize cumulative reward by interacting with the environment. The robot learns a policy through exploration and exploitation. This paper investigates the scenario in which a robot walks continuously without falling. Our experiments show that the two-legged robot can autonomously adapt to the environment, without a human solving the problems for it. DDPG shows promise but suffers from instability and hyperparameter sensitivity. Our improved TD3 mitigates DDPG’s overestimation bias, improving stability and performance. This study also evaluates the stability, convergence, and computational efficiency of DDPG and TD3.
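
The overestimation bias mentioned in the abstract can be illustrated with a minimal sketch of how standard TD3 forms its critic target compared with DDPG. This follows the generally published form of TD3 (clipped double Q-learning and target policy smoothing), not the improved variant proposed in the paper, whose details are not given here. All function names, hyperparameter values, and the toy batch are assumptions chosen for illustration.

```python
import numpy as np

def ddpg_target(reward, done, next_q, gamma=0.99):
    """DDPG-style target: bootstraps from a single target critic."""
    return reward + gamma * (1.0 - done) * next_q

def td3_target(reward, done, next_q1, next_q2, gamma=0.99):
    """TD3-style target: bootstraps from the minimum of two target critics."""
    return reward + gamma * (1.0 - done) * np.minimum(next_q1, next_q2)

def smoothed_target_action(action, rng, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    """Target policy smoothing: add clipped Gaussian noise to the target action."""
    noise = np.clip(rng.normal(0.0, noise_std, size=action.shape), -noise_clip, noise_clip)
    return np.clip(action + noise, -max_action, max_action)

# Toy batch of three transitions; all numbers are made up for illustration.
reward = np.array([1.0, 0.5, -1.0])
done = np.array([0.0, 0.0, 1.0])       # the last transition is terminal
next_q1 = np.array([10.0, 4.0, 7.0])   # hypothetical output of target critic 1
next_q2 = np.array([8.0, 5.0, 9.0])    # hypothetical output of target critic 2

y_ddpg = ddpg_target(reward, done, next_q1)
y_td3 = td3_target(reward, done, next_q1, next_q2)
```

Because the TD3 target uses the smaller of the two critic estimates, it is never larger than a single-critic target built from either critic, which is how the twin critics counteract overestimation.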

Original language: English
Title of host publication: Selected Proceedings from the 2nd International Conference on Intelligent Manufacturing and Robotics, ICIMR 2024 - Advances in Intelligent Manufacturing and Robotics
Editors: Wei Chen, Andrew Huey Ping Tan, Yang Luo, Long Huang, Yuyi Zhu, Anwar PP Abdul Majeed, Fan Zhang, Yuyao Yan, Chenguang Liu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 10-23
Number of pages: 14
ISBN (Print): 9789819639489
DOIs
Publication status: Published - 2025
Event: 2nd International Conference on Intelligent Manufacturing and Robotics, ICIMR 2024 - Suzhou, China
Duration: 22 Aug 2024 - 23 Aug 2024

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 1316 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: 2nd International Conference on Intelligent Manufacturing and Robotics, ICIMR 2024
Country/Territory: China
City: Suzhou
Period: 22/08/24 - 23/08/24

Keywords

  • Continuous Control
  • DDPG algorithm
  • Robotics
  • TD3 algorithm
