GEMs-LLM: Integrating Large Language Models with Goal-Aware Exploration for RL-based Portfolio Optimization

Yining Wang, Zhixiang Lu, Pin Qian, Jionglong Su, Mian Zhou, Chong Li, Zhengyong Jiang*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

We introduce GEMs-LLM, a novel reinforcement learning framework for portfolio optimization that integrates Goal-aware Exploration and Multi-level Supervision (GEMs) with a large language model (DeepSeek-V3). Existing reinforcement learning approaches often suffer from high-dimensional state spaces, sparse rewards, and instability in financial environments. GEMs-LLM addresses these issues via a hierarchical structure: a high-level controller generates portfolio-level goals using both historical and synthetic future market data, while a low-level agent learns to execute these goals via multi-level policy supervision. To further align the agent's behavior with human trading intuition, DeepSeek-V3 is employed to simulate expert-like reasoning and refine decision outputs. GEMs-LLM supports off-policy training and removes the need for handcrafted goals, enhancing adaptability across markets. Empirical results on both U.S. and Chinese stock markets show that GEMs-LLM significantly outperforms strong baselines including Deep Deterministic Policy Gradient (DDPG), Oracle Policy Distillation (OPD), and pure GEMs variants. In particular, GEMs-LLM achieves the best performance in annualized Sharpe ratio (ASR) and downside deviation ratio (DDR), highlighting its robustness and potential for real-world deployment.
Original language: English
Title of host publication: 2025 21st International Conference on Intelligent Computing
Publisher: Springer Nature Singapore
Pages: 516-527
Publication status: Published - Jul 2025
