Abstract
This paper introduces a novel hybrid portfolio optimization framework that integrates deep reinforcement learning (DRL) with the Black-Litterman (BL) model under elliptical distributions. Traditional portfolio optimization methods often fail to adapt to non-stationary environments with high volatility and heavy-tailed asset return distributions, while standard DRL implementations struggle to model the complex statistical properties of financial returns. Our framework, BLED (Black-Litterman under Elliptical Distributions), addresses these limitations by using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm with transformer-based architectures for view generation and CNN-based risk aversion estimation. We incorporate elliptical distributions to model heavy-tailed asset returns more accurately, substantially improving risk assessment and portfolio allocation decisions. Empirical evaluation on Dow Jones Industrial Average constituent stocks demonstrates that our approach significantly outperforms both traditional portfolio optimization strategies and state-of-the-art DRL methods, achieving 76.1% accumulated returns compared with approximately 44% for the best baseline models. The advantage is more pronounced on risk-adjusted metrics: Sharpe and Sortino ratios are approximately 46% higher than those of the top-performing baselines. A non-parametric Kruskal-Wallis test across multiple time periods confirms that the performance improvements are statistically significant (p = 0.030 for returns), indicating that BLED's advantages are robust across varying market conditions.
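For readers unfamiliar with the Black-Litterman step that the framework builds on, the following is a minimal sketch of the classical Gaussian posterior-mean update. It is not the elliptical-distribution variant proposed in the paper, and it does not reproduce the TD3, transformer, or CNN components. The function name, the value of `tau`, and all numbers in the example are illustrative assumptions.

```python
import numpy as np


def black_litterman_posterior(pi, sigma, P, q, omega, tau=0.05):
    """Classical (Gaussian) Black-Litterman posterior mean of expected returns.

    pi    : (n,)   prior (equilibrium) expected returns
    sigma : (n, n) covariance matrix of asset returns
    P     : (k, n) pick matrix, one row per investor view
    q     : (k,)   expected return stated by each view
    omega : (k, k) covariance of the view errors (view uncertainty)
    tau   : scalar scaling the uncertainty of the prior (assumed value)
    """
    scaled_sigma = tau * sigma
    # mu = pi + tau*Sigma P^T (P tau*Sigma P^T + Omega)^{-1} (q - P pi)
    view_cov = P @ scaled_sigma @ P.T + omega
    adjustment = scaled_sigma @ P.T @ np.linalg.solve(view_cov, q - P @ pi)
    return pi + adjustment


# Hypothetical two-asset example: the prior says asset 0 underperforms
# asset 1 by 2%, while the single view says it outperforms by 2%.
pi = np.array([0.05, 0.07])
sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
P = np.array([[1.0, -1.0]])
q = np.array([0.02])
omega = np.array([[0.001]])

mu = black_litterman_posterior(pi, sigma, P, q, omega)
print(mu)
```

The posterior spread between the two assets falls between the prior spread and the view, weighted by the relative uncertainty in `omega` and `tau * sigma`. In the framework described above, the views and the risk aversion parameter would come from learned components rather than being set by hand.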
Original language | English |
---|---|
Title of host publication | 2025 21st International Conference on Intelligent Computing |
Publication status | Accepted/In press - 2025 |