TY - GEN
T1 - Reinforcement Learning Algorithm for Two-Leg Robot with DDPG and TD3
AU - Li, Dexuan
AU - Jin, Nanlin
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - Reinforcement Learning (RL) is becoming popular for two-legged robots, which learn and improve through trial and error, adjusting their actions based on feedback. Deep RL, which combines RL with deep learning, handles the high-dimensional state and action spaces found in robotics. Two important Deep RL algorithms are Deep Deterministic Policy Gradient (DDPG) and Twin Delayed DDPG (TD3). Our proposed algorithm, which improves on TD3, maximizes cumulative rewards by interacting with the environment. The robot learns a policy through exploration and exploitation. This paper investigates the scenario of robots walking continuously without falling. Our experiments show that the two-legged robot can autonomously adapt to the environment by itself, without humans solving the problems for it. DDPG shows promise but suffers from instability and hyperparameter sensitivity. Our improved TD3 mitigates DDPG’s overestimation bias, improving stability and performance. This study also evaluates the stability, convergence, and computational efficiency of DDPG and TD3.
AB - Reinforcement Learning (RL) is becoming popular for two-legged robots, which learn and improve through trial and error, adjusting their actions based on feedback. Deep RL, which combines RL with deep learning, handles the high-dimensional state and action spaces found in robotics. Two important Deep RL algorithms are Deep Deterministic Policy Gradient (DDPG) and Twin Delayed DDPG (TD3). Our proposed algorithm, which improves on TD3, maximizes cumulative rewards by interacting with the environment. The robot learns a policy through exploration and exploitation. This paper investigates the scenario of robots walking continuously without falling. Our experiments show that the two-legged robot can autonomously adapt to the environment by itself, without humans solving the problems for it. DDPG shows promise but suffers from instability and hyperparameter sensitivity. Our improved TD3 mitigates DDPG’s overestimation bias, improving stability and performance. This study also evaluates the stability, convergence, and computational efficiency of DDPG and TD3.
KW - Continuous Control
KW - DDPG algorithm
KW - Robotics
KW - TD3 algorithm
UR - http://www.scopus.com/inward/record.url?scp=105002720965&partnerID=8YFLogxK
U2 - 10.1007/978-981-96-3949-6_2
DO - 10.1007/978-981-96-3949-6_2
M3 - Conference Proceeding
AN - SCOPUS:105002720965
SN - 9789819639489
T3 - Lecture Notes in Networks and Systems
SP - 10
EP - 23
BT - Selected Proceedings from the 2nd International Conference on Intelligent Manufacturing and Robotics, ICIMR 2024 - Advances in Intelligent Manufacturing and Robotics
A2 - Chen, Wei
A2 - Ping Tan, Andrew Huey
A2 - Luo, Yang
A2 - Huang, Long
A2 - Zhu, Yuyi
A2 - PP Abdul Majeed, Anwar
A2 - Zhang, Fan
A2 - Yan, Yuyao
A2 - Liu, Chenguang
PB - Springer Science and Business Media Deutschland GmbH
T2 - 2nd International Conference on Intelligent Manufacturing and Robotics, ICIMR 2024
Y2 - 22 August 2024 through 23 August 2024
ER -