An Assessment of Multistage Reward Function Design for Deep Reinforcement Learning-Based Microgrid Energy Management

Hui Hwang Goh; Yifeng Huang; Chee Shen Lim; Dongdong Zhang; Hui Liu; Wei Dai; Tonni Agustiono Kurniawan; Saifur Rahman

doi:10.1109/TSG.2022.3179567

Hui Hwang Goh*, Yifeng Huang, Chee Shen Lim, Dongdong Zhang, Hui Liu, Wei Dai, Tonni Agustiono Kurniawan, Saifur Rahman

*Corresponding author for this work

Department of Electrical and Electronic Engineering

Research output: Contribution to journal › Article › peer-review

48 Citations (Scopus)

Abstract

Reinforcement-learning-based energy management has been an active research subject in the past few years. In contrast to the baseline reward function (BRF), this work proposes and investigates a multi-stage reward mechanism (MSRM) that scores the agent's per-step and final performance during training and returns the score to the agent in real time as a reward. MSRM also improves the agent's training through expert intervention, which aims to prevent the agent from being trapped in suboptimal strategies. The energy management performance considered by the MSRM-based algorithm includes energy balance, economic cost, and reliability. The reward function is assessed in conjunction with two deep reinforcement learning algorithms: double deep Q-learning network (DDQN) and policy gradient (PG). Benchmarked against BRF, the numerical simulation shows that MSRM tends to improve the convergence characteristic, reduce the explained variance, and reduce the tendency of the agent to become trapped in suboptimal strategies. In addition, the methods have been compared with model predictive control (MPC)-based energy management strategies in terms of relative cost, self-balancing rate, and computational time. The assessment concludes that, in the given context, PG-MSRM has the best overall performance.

Original language	English
Pages (from-to)	4300-4311
Number of pages	12
Journal	IEEE Transactions on Smart Grid
Volume	13
Issue number	6
DOIs	https://doi.org/10.1109/TSG.2022.3179567
Publication status	Published - 1 Nov 2022

Keywords

Microgrid energy management
deep reinforcement learning
optimal scheduling
reward function

Cite this

@article{25700de5a3c54236abf75db1f6e53e35,
title = "An Assessment of Multistage Reward Function Design for Deep Reinforcement Learning-Based Microgrid Energy Management",
abstract = "Reinforcement learning based energy management strategy has been an active research subject in the past few years. Different from the baseline reward function (BRF), the work proposes and investigates a multi-stage reward mechanism (MSRM) that scores the agent's step and final performance during training and returns it to the agent in real time as a reward. MSRM will also improve the agent's training through expert intervention which aims to prevent the agent from being trapped in sub-optimal strategies. The energy management performance considered by MSRM-based algorithm includes the energy balance, economic cost, and reliability. The reward function is assessed in conjunction with two deep reinforcement learning algorithms: double deep Q-learning network (DDQN) and policy gradient (PG). Upon benchmarking with BRF, the numerical simulation shows that MSRM tends to improve the convergence characteristic, reduce the explained variance, and reduce the tendency of the agent being trapped in suboptimal strategies. In addition, the methods have been assessed with MPC-based energy management strategies in terms of relative cost, self-balancing rate, and computational time. The assessment concludes that, in the given context, PG-MSRM has the best overall performance.",
keywords = "Microgrid energy management, deep reinforcement learning, optimal scheduling, reward function",
author = "Goh, {Hui Hwang} and Yifeng Huang and Lim, {Chee Shen} and Dongdong Zhang and Hui Liu and Wei Dai and Kurniawan, {Tonni Agustiono} and Saifur Rahman",
note = "Publisher Copyright: {\textcopyright} 2010-2012 IEEE.",
year = "2022",
month = nov,
day = "1",
doi = "10.1109/TSG.2022.3179567",
language = "English",
volume = "13",
pages = "4300--4311",
journal = "IEEE Transactions on Smart Grid",
issn = "1949-3053",
publisher = "IEEE",
number = "6",
}

TY  - JOUR
T1  - An Assessment of Multistage Reward Function Design for Deep Reinforcement Learning-Based Microgrid Energy Management
AU  - Goh, Hui Hwang
AU  - Huang, Yifeng
AU  - Lim, Chee Shen
AU  - Zhang, Dongdong
AU  - Liu, Hui
AU  - Dai, Wei
AU  - Kurniawan, Tonni Agustiono
AU  - Rahman, Saifur
PY  - 2022/11/1
Y1  - 2022/11/1
N2  - Reinforcement learning based energy management strategy has been an active research subject in the past few years. Different from the baseline reward function (BRF), the work proposes and investigates a multi-stage reward mechanism (MSRM) that scores the agent's step and final performance during training and returns it to the agent in real time as a reward. MSRM will also improve the agent's training through expert intervention which aims to prevent the agent from being trapped in sub-optimal strategies. The energy management performance considered by MSRM-based algorithm includes the energy balance, economic cost, and reliability. The reward function is assessed in conjunction with two deep reinforcement learning algorithms: double deep Q-learning network (DDQN) and policy gradient (PG). Upon benchmarking with BRF, the numerical simulation shows that MSRM tends to improve the convergence characteristic, reduce the explained variance, and reduce the tendency of the agent being trapped in suboptimal strategies. In addition, the methods have been assessed with MPC-based energy management strategies in terms of relative cost, self-balancing rate, and computational time. The assessment concludes that, in the given context, PG-MSRM has the best overall performance.
AB  - Reinforcement learning based energy management strategy has been an active research subject in the past few years. Different from the baseline reward function (BRF), the work proposes and investigates a multi-stage reward mechanism (MSRM) that scores the agent's step and final performance during training and returns it to the agent in real time as a reward. MSRM will also improve the agent's training through expert intervention which aims to prevent the agent from being trapped in sub-optimal strategies. The energy management performance considered by MSRM-based algorithm includes the energy balance, economic cost, and reliability. The reward function is assessed in conjunction with two deep reinforcement learning algorithms: double deep Q-learning network (DDQN) and policy gradient (PG). Upon benchmarking with BRF, the numerical simulation shows that MSRM tends to improve the convergence characteristic, reduce the explained variance, and reduce the tendency of the agent being trapped in suboptimal strategies. In addition, the methods have been assessed with MPC-based energy management strategies in terms of relative cost, self-balancing rate, and computational time. The assessment concludes that, in the given context, PG-MSRM has the best overall performance.
KW  - Microgrid energy management
KW  - deep reinforcement learning
KW  - optimal scheduling
KW  - reward function
UR  - http://www.scopus.com/inward/record.url?scp=85131739183&partnerID=8YFLogxK
U2  - 10.1109/TSG.2022.3179567
DO  - 10.1109/TSG.2022.3179567
M3  - Article
AN  - SCOPUS:85131739183
SN  - 1949-3053
VL  - 13
SP  - 4300
EP  - 4311
JO  - IEEE Transactions on Smart Grid
JF  - IEEE Transactions on Smart Grid
IS  - 6
ER  - 
