Bayesian adversarial multi-node bandit for optimal smart grid protection against cyber attacks

Jianyu Xu, Bin Liu, Huadong Mo*, Daoyi Dong

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

19 Citations (Scopus)


The cyber security of smart grids has become one of the key problems in developing reliable modern power and energy systems. This paper introduces a non-stationary adversarial cost with a variation constraint for smart grids, which makes it possible to investigate optimal smart grid protection against cyber attacks in a relatively practical scenario. In particular, a Bayesian multi-node bandit (MNB) model with adversarial costs is constructed, and a new regret function is defined for this model. An algorithm called the Thompson–Hedge algorithm is presented to solve the problem, and its superior performance is proven in terms of the convergence rate of the regret function. The applicability of the algorithm to real smart grid scenarios is verified, and its performance is demonstrated with numerical examples.
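The abstract does not give the details of the Thompson–Hedge algorithm, so the following is only a minimal sketch of the two standard building blocks its name suggests: Thompson sampling from a Beta posterior and the Hedge (exponential weights) update. It is not the authors' method. The cost definition, the attack probabilities, the learning rate `eta`, and the names `hedge_update`, `thompson_sample`, and `run_demo` are all assumptions made for illustration.

```python
import math
import random


def hedge_update(weights, costs, eta):
    """One Hedge step: scale each weight by exp(-eta * cost), then renormalize."""
    scaled = [w * math.exp(-eta * c) for w, c in zip(weights, costs)]
    total = sum(scaled)
    return [s / total for s in scaled]


def thompson_sample(hits, misses, rng):
    """Draw one attack-rate sample per node from a Beta(1 + hits, 1 + misses) posterior."""
    return [rng.betavariate(1 + h, 1 + m) for h, m in zip(hits, misses)]


def run_demo(attack_prob=(0.1, 0.2, 0.7), rounds=200, eta=0.1, seed=0):
    """Toy loop: learn which node to protect when attacks are Bernoulli draws.

    attack_prob is a hypothetical, hidden per-node attack rate; it is an
    assumption of this sketch, not a quantity taken from the paper.
    """
    rng = random.Random(seed)
    n = len(attack_prob)
    weights = [1.0 / n] * n   # protection distribution over nodes
    hits = [0] * n            # rounds in which a node was attacked
    misses = [0] * n          # rounds in which it was not
    for _ in range(rounds):
        # Observe this round's attacks and update the Beta posterior counts.
        for i, p in enumerate(attack_prob):
            if rng.random() < p:
                hits[i] += 1
            else:
                misses[i] += 1
        # Assumed cost of protecting node i: one minus its sampled attack rate,
        # so nodes that look heavily attacked are cheap to protect.
        sampled = thompson_sample(hits, misses, rng)
        costs = [1.0 - s for s in sampled]
        weights = hedge_update(weights, costs, eta)
    return weights
```

Calling `run_demo()` returns a probability vector over the three toy nodes. Under these assumptions, the weight concentrates on the node with the highest attack rate as the rounds accumulate.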

Original language: English
Article number: 109551
Publication status: Published - Jun 2021


  • Bayesian updating
  • Cyber attack
  • Multi-node bandit
  • Reinforcement learning
  • Smart grid
