Evaluating and Selecting Deep Reinforcement Learning Models for Optimal Dynamic Pricing: A Systematic Comparison of PPO, DDPG, and SAC

Yuchen Liu, Ka Lok Man*, Gangmin Li, Terry R. Payne, Yong Yue

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

1 Citation (Scopus)

Abstract

Given the plethora of available solutions, choosing an appropriate Deep Reinforcement Learning (DRL) model for dynamic pricing poses a significant challenge for practitioners. While many DRL solutions claim superior performance, a standardized framework for evaluating them is lacking. To address this gap, we introduce a novel framework and a set of metrics for selecting and assessing DRL models systematically. To validate the utility of our framework, we critically compared three representative DRL models, emphasizing their performance in dynamic pricing tasks. To further ensure the robustness of our assessment, we benchmarked these models against a well-established human agent policy. The DRL model that emerged as the most effective was then tested on an Amazon dataset, demonstrating a notable performance boost of 5.64%. Our findings underscore the value of our proposed metrics and framework in guiding practitioners towards the most suitable DRL solution for dynamic pricing.

Original language: English
Title of host publication: Proceedings - 2024 8th International Conference on Control Engineering and Artificial Intelligence, CCEAI 2024
Editors: Wenqiang Zhang, Yong Yue, Marek Ogiela
Publisher: Association for Computing Machinery
Pages: 215-219
Number of pages: 5
ISBN (Electronic): 9798400707971
DOIs
Publication status: Published - 26 Jan 2024
Event: 8th International Conference on Control Engineering and Artificial Intelligence, CCEAI 2024 - Shanghai, China
Duration: 26 Jan 2024 to 28 Jan 2024

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 8th International Conference on Control Engineering and Artificial Intelligence, CCEAI 2024
Country/Territory: China
City: Shanghai
Period: 26/01/24 to 28/01/24

Keywords

  • DDPG (Deep Deterministic Policy Gradient)
  • Deep Reinforcement Learning (DRL)
  • Dynamic Pricing
  • E-commerce
  • Inventory Management
  • Markov Decision Process
  • Model Evaluation
  • PPO (Proximal Policy Optimization)
  • Price Elasticity of Demand
  • SAC (Soft Actor-Critic)
