Online reinforcement learning for condition-based group maintenance using factored Markov decision processes

Jianyu Xu; Bin Liu; Xiujie Zhao; Xiao Lin Wang

doi:10.1016/j.ejor.2023.11.039

Corresponding author: Bin Liu

Department of Intelligent Operations and Marketing

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

We investigate a condition-based group maintenance problem for multi-component systems, where the degradation process of a specific component is affected only by its neighbouring ones, leading to a special type of stochastic dependence among components. We formulate the maintenance problem into a factored Markov decision process taking advantage of this dependence property, and develop a factored value iteration algorithm to efficiently approximate the optimal policy. Through both theoretical analyses and numerical experiments, we show that the algorithm can significantly reduce computational burden and improve efficiency in solving the optimization problem. Moreover, since model parameters are unknown a priori in most practical scenarios, we further develop an online reinforcement learning algorithm to simultaneously learn the model parameters and determine an optimal maintenance action upon each inspection. A novel feature of this online learning algorithm is that it is capable of learning both transition probabilities and system structure indicating the stochastic dependence among components. We discuss the error bound and sample complexity of the developed learning algorithm theoretically, and test its performance through numerical experiments. The results reveal that our algorithm can effectively learn the model parameters and approximate the optimal maintenance policy.
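To make the neighbour-dependence idea concrete, here is a minimal Python sketch. It is not the paper's factored value iteration algorithm: it only builds a joint transition model as a product of per-component factors, each depending on the component and its neighbours, and then runs ordinary discounted value iteration on the small joint state space. The three-component line layout, the degradation probabilities, the cost figures, and all function names are hypothetical choices made for illustration.

```python
import itertools

# All values below are hypothetical and chosen only for illustration.
N = 3                       # components on a line: 0 - 1 - 2
LEVELS = 3                  # 0 = good, 1 = degraded, 2 = failed
NEIGHBOURS = {0: [1], 1: [0, 2], 2: [1]}
BASE_P, BUMP = 0.2, 0.15    # degradation probability and per-failed-neighbour increase
SETUP, REPLACE, DOWNTIME = 5.0, 3.0, 10.0   # shared setup, per-unit replacement, per-failure cost
GAMMA = 0.9                 # discount factor

STATES = list(itertools.product(range(LEVELS), repeat=N))
ACTIONS = list(itertools.product((0, 1), repeat=N))   # 1 = replace that component


def local_transition(i, state, replace):
    """Next-level distribution of component i, using only itself and its neighbours."""
    if replace:
        return {0: 1.0}
    level = state[i]
    if level == LEVELS - 1:            # a failed component stays failed
        return {level: 1.0}
    failed = sum(state[j] == LEVELS - 1 for j in NEIGHBOURS[i])
    p = BASE_P + BUMP * failed
    return {level: 1.0 - p, level + 1: p}


def joint_transition(state, action):
    """Joint next-state distribution as the product of the local factors."""
    factors = [local_transition(i, state, action[i]) for i in range(N)]
    dist = {}
    for combo in itertools.product(*(f.items() for f in factors)):
        nxt = tuple(level for level, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        dist[nxt] = dist.get(nxt, 0.0) + prob
    return dist


def cost(state, action):
    """One-period cost: shared setup plus replacements, plus downtime for failures."""
    k = sum(action)
    maintenance = SETUP + REPLACE * k if k else 0.0
    return maintenance + DOWNTIME * sum(level == LEVELS - 1 for level in state)


def value_iteration(tol=1e-8):
    """Plain value iteration over the joint state space (cost minimisation)."""
    P = {(s, a): joint_transition(s, a) for s in STATES for a in ACTIONS}
    V = {s: 0.0 for s in STATES}
    while True:
        new_V, policy = {}, {}
        for s in STATES:
            q_values = {
                a: cost(s, a) + GAMMA * sum(p * V[t] for t, p in P[(s, a)].items())
                for a in ACTIONS
            }
            best = min(q_values, key=q_values.get)
            new_V[s], policy[s] = q_values[best], best
        if max(abs(new_V[s] - V[s]) for s in STATES) < tol:
            return new_V, policy
        V = new_V


if __name__ == "__main__":
    V, policy = value_iteration()
    print("action when every component has failed:", policy[(2, 2, 2)])
```

The shared setup cost is what makes grouping replacements attractive in this toy model. The joint state space here has only 27 states; it grows exponentially with the number of components, which is the computational burden the abstract says the factored formulation is meant to reduce.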

Original language	English
Pages (from-to)	176-190
Number of pages	15
Journal	European Journal of Operational Research
Volume	315
Issue number	1
DOIs	https://doi.org/10.1016/j.ejor.2023.11.039
Publication status	Published - 16 May 2024

Keywords

Condition-based group maintenance
Factored Markov decision process
Factored value iteration
Maintenance
Online reinforcement learning

Cite this

@article{c143b78c8a594e278de6e990e29f6f8e,
title = "Online reinforcement learning for condition-based group maintenance using factored {M}arkov decision processes",
abstract = "We investigate a condition-based group maintenance problem for multi-component systems, where the degradation process of a specific component is affected only by its neighbouring ones, leading to a special type of stochastic dependence among components. We formulate the maintenance problem into a factored Markov decision process taking advantage of this dependence property, and develop a factored value iteration algorithm to efficiently approximate the optimal policy. Through both theoretical analyses and numerical experiments, we show that the algorithm can significantly reduce computational burden and improve efficiency in solving the optimization problem. Moreover, since model parameters are unknown a priori in most practical scenarios, we further develop an online reinforcement learning algorithm to simultaneously learn the model parameters and determine an optimal maintenance action upon each inspection. A novel feature of this online learning algorithm is that it is capable of learning both transition probabilities and system structure indicating the stochastic dependence among components. We discuss the error bound and sample complexity of the developed learning algorithm theoretically, and test its performance through numerical experiments. The results reveal that our algorithm can effectively learn the model parameters and approximate the optimal maintenance policy.",

keywords = "Condition-based group maintenance, Factored Markov decision process, Factored value iteration, Maintenance, Online reinforcement learning",

author = "Jianyu Xu and Bin Liu and Xiujie Zhao and Wang, {Xiao Lin}",

note = "Publisher Copyright: {\textcopyright} 2023 The Author(s)",

year = "2024",

month = may,

day = "16",

doi = "10.1016/j.ejor.2023.11.039",

language = "English",

volume = "315",

pages = "176--190",

journal = "European Journal of Operational Research",

issn = "0377-2217",

publisher = "Elsevier",

number = "1",

}

TY - JOUR
T1 - Online reinforcement learning for condition-based group maintenance using factored Markov decision processes
AU - Xu, Jianyu
AU - Liu, Bin
AU - Zhao, Xiujie
AU - Wang, Xiao Lin
PY - 2024/5/16
Y1 - 2024/5/16
AB - We investigate a condition-based group maintenance problem for multi-component systems, where the degradation process of a specific component is affected only by its neighbouring ones, leading to a special type of stochastic dependence among components. We formulate the maintenance problem into a factored Markov decision process taking advantage of this dependence property, and develop a factored value iteration algorithm to efficiently approximate the optimal policy. Through both theoretical analyses and numerical experiments, we show that the algorithm can significantly reduce computational burden and improve efficiency in solving the optimization problem. Moreover, since model parameters are unknown a priori in most practical scenarios, we further develop an online reinforcement learning algorithm to simultaneously learn the model parameters and determine an optimal maintenance action upon each inspection. A novel feature of this online learning algorithm is that it is capable of learning both transition probabilities and system structure indicating the stochastic dependence among components. We discuss the error bound and sample complexity of the developed learning algorithm theoretically, and test its performance through numerical experiments. The results reveal that our algorithm can effectively learn the model parameters and approximate the optimal maintenance policy.
KW - Condition-based group maintenance
KW - Factored Markov decision process
KW - Factored value iteration
KW - Maintenance
KW - Online reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85178145416&partnerID=8YFLogxK
U2 - 10.1016/j.ejor.2023.11.039
DO - 10.1016/j.ejor.2023.11.039
M3 - Article
AN - SCOPUS:85178145416
SN - 0377-2217
VL - 315
SP - 176
EP - 190
JO - European Journal of Operational Research
JF - European Journal of Operational Research
IS - 1
ER -
