TY - JOUR
T1 - Autonomous Input Voltage Sharing Control and Triple Phase Shift Modulation Method for ISOP-DAB Converter in DC Microgrid
T2 - A Multiagent Deep Reinforcement Learning-Based Method
AU - Zeng, Yu
AU - Pou, Josep
AU - Sun, Changjiang
AU - Mukherjee, Suvajit
AU - Xu, Xu
AU - Gupta, Amit Kumar
AU - Dong, Jiaxin
N1 - Publisher Copyright: © 1986-2012 IEEE.
PY - 2023/3/1
Y1 - 2023/3/1
AB - This article proposes a multiagent (MA) deep reinforcement learning (DRL)-based autonomous input voltage sharing (IVS) control and triple phase shift modulation method for input-series output-parallel (ISOP) dual active bridge (DAB) converters to address three challenges: the uncertainties of the dc microgrid, the power balance problem, and the minimization of the converter's current stress. Specifically, the control and modulation problem of the ISOP-DAB converter is formulated as a Markov game with several DRL agents. Subsequently, the MA twin-delayed deep deterministic policy gradient (MA-TD3) algorithm is applied to train the DRL agents offline. After training, the agents provide online control decisions for the ISOP-DAB converter to balance the IVS and minimize the current stress among the submodules. Without accurate model information, the proposed method can adaptively obtain the optimal modulation variable combinations in a stochastic and uncertain environment. Simulation and experimental results verify the effectiveness of the proposed MA-TD3-based algorithm.
KW - Input-series output-parallel-connected dual active bridge (ISOP-DAB) converter
KW - input voltage sharing (IVS)
KW - multiagent twin-delayed deep deterministic policy gradient (MA-TD3)
KW - triple phase shift modulation
UR - http://www.scopus.com/inward/record.url?scp=85141581566&partnerID=8YFLogxK
U2 - 10.1109/TPEL.2022.3218900
DO - 10.1109/TPEL.2022.3218900
M3 - Article
AN - SCOPUS:85141581566
SN - 0885-8993
VL - 38
SP - 2985
EP - 3000
JO - IEEE Transactions on Power Electronics
JF - IEEE Transactions on Power Electronics
IS - 3
ER -