IRDA: Incremental Reinforcement Learning for Dynamic Resource Allocation

Jia Wang*, Jiannong Cao, Senzhang Wang, Zhongyu Yao, Wengen Li

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

9 Citations (Scopus)

Abstract

Resource allocation problems often manifest as online decision-making tasks in which the proper allocation strategy depends on an understanding of the allocation environment and the resource workload. Most existing resource allocation methods rely on meticulously designed heuristics that ignore the patterns of incoming tasks, so they cannot properly handle the dynamics of those tasks. To address this problem, we mine task patterns from a large volume of historical allocation data and propose a reinforcement learning model, termed IRDA, that learns the allocation strategy incrementally. We observe that historical allocation data is usually generated by daily repeated operations and is therefore not independent and identically distributed. Training on only part of this dataset can already make the allocation strategy converge, which wastes much of the remaining data. To improve learning efficiency, we partition the whole historical allocation dataset into multiple batch datasets, which forces the agent to continuously 'explore' and learn on distinct state spaces. IRDA reuses the strategy learned from the previous batch dataset and adapts it to learning on the next batch dataset, so that it learns incrementally from the multi-batch datasets and improves the allocation strategy. We apply the proposed method to baggage carousel allocation at Hong Kong International Airport (HKIA). The experimental results show that IRDA is capable of incrementally learning from multi-batch datasets and improves baggage carousel resource utilization by around 51.86 percent compared with the current baggage carousel allocation system at HKIA.
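The batch-by-batch idea in the abstract can be illustrated with a minimal sketch. This is not the authors' IRDA model: it assumes plain tabular Q-learning replayed over logged transitions, and the names `split_into_batches`, `train_on_batch`, `incremental_train`, the `ACTIONS` tuple, and the transition layout `(state, action, reward, next_state)` are all hypothetical. It only shows the two steps the abstract names: partitioning the history into batches, and carrying the strategy learned on one batch into training on the next.

```python
from collections import defaultdict

# Hypothetical action set; a real allocator would enumerate its own choices.
ACTIONS = (0, 1)


def split_into_batches(items, num_batches):
    """Partition an ordered history into contiguous, near-equal batches."""
    size, extra = divmod(len(items), num_batches)
    batches, start = [], 0
    for i in range(num_batches):
        end = start + size + (1 if i < extra else 0)
        batches.append(items[start:end])
        start = end
    return batches


def train_on_batch(q, batch, alpha, gamma):
    """Apply one Q-learning update per logged transition, in place.

    A next_state of None marks the end of an episode.
    """
    for state, action, reward, next_state in batch:
        if next_state is None:
            best_next = 0.0
        else:
            best_next = max(q[(next_state, a)] for a in ACTIONS)
        target = reward + gamma * best_next
        q[(state, action)] += alpha * (target - q[(state, action)])
    return q


def incremental_train(transitions, num_batches, alpha=0.1, gamma=0.9):
    """Train batch by batch, reusing the Q-table from the previous batch."""
    q = defaultdict(float)  # shared table: the warm start between batches
    for batch in split_into_batches(transitions, num_batches):
        train_on_batch(q, batch, alpha, gamma)
    return q
```

In this sketch the reuse step is simply that `q` is never reset between batches. How IRDA actually adapts the previous strategy to the next batch is described in the paper, not here.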

Original language: English
Pages (from-to): 770-783
Number of pages: 14
Journal: IEEE Transactions on Big Data
Volume: 8
Issue number: 3
Publication status: Published - 1 Jun 2022
Externally published: Yes

Keywords

  • Resource allocation
  • airport resource management
  • baggage handling
  • reinforcement learning
