Indexability of restless bandit problems and optimality of Whittle index for dynamic multichannel access

Keqin Liu; Qing Zhao

doi:10.1109/TIT.2010.2068950

Keqin Liu^*, Qing Zhao

^*Corresponding author for this work

Department of Financial and Actuarial Mathematics

Cornell University

Research output: Contribution to journal › Article › peer-review

271 Citations (Scopus)

Abstract

In this paper, we consider a class of restless multiarmed bandit processes (RMABs) that arises in dynamic multichannel access, user/server scheduling, and optimal activation in multiagent systems. For this class of RMABs, we establish the indexability and obtain Whittle index in closed form for both discounted and average reward criteria. These results lead to a direct implementation of Whittle index policy with remarkably low complexity. When arms are stochastically identical, we show that Whittle index policy is optimal under certain conditions. Furthermore, it has a semiuniversal structure that obviates the need to know the Markov transition probabilities. The optimality and the semiuniversal structure result from the equivalence between Whittle index policy and the myopic policy established in this work. For nonidentical arms, we develop efficient algorithms for computing a performance upper bound given by Lagrangian relaxation. The tightness of the upper bound and the near-optimal performance of Whittle index policy are illustrated with simulation examples.
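The myopic policy that the abstract relates to the Whittle index policy can be illustrated with a small simulation. The following is a minimal sketch and not the authors' implementation. It assumes a model the abstract does not spell out: each channel is an independent two-state (good = 1, bad = 0) Markov chain with identical transition probabilities `p01` (bad to good) and `p11` (good to good), the user senses `k` channels per slot, and one unit of reward is earned for each sensed channel found in the good state. All function names and parameter values are hypothetical.

```python
import random


def propagate(belief, p01, p11):
    """One-step belief update for a channel that was not sensed.

    `belief` is the probability that the channel is currently good.
    """
    return belief * p11 + (1.0 - belief) * p01


def myopic_step(beliefs, states, k, p01, p11):
    """Sense the k channels with the largest belief.

    Returns (reward, new_beliefs), where new_beliefs refer to the next slot.
    """
    order = sorted(range(len(beliefs)), key=lambda i: beliefs[i], reverse=True)
    chosen = set(order[:k])
    reward = sum(states[i] for i in chosen)
    new_beliefs = []
    for i, b in enumerate(beliefs):
        if i in chosen:
            # The state was observed, so the next belief is a transition probability.
            new_beliefs.append(p11 if states[i] == 1 else p01)
        else:
            new_beliefs.append(propagate(b, p01, p11))
    return reward, new_beliefs


def simulate(n=5, k=1, p01=0.2, p11=0.8, horizon=1000, seed=0):
    """Average per-slot reward of the myopic policy (assumed model, see lead-in)."""
    rng = random.Random(seed)
    stationary = p01 / (p01 + 1.0 - p11)  # stationary probability of the good state
    states = [1 if rng.random() < stationary else 0 for _ in range(n)]
    beliefs = [stationary] * n
    total = 0
    for _ in range(horizon):
        reward, beliefs = myopic_step(beliefs, states, k, p01, p11)
        total += reward
        # Every channel evolves in every slot, sensed or not (the "restless" part).
        states = [1 if rng.random() < (p11 if s == 1 else p01) else 0 for s in states]
    return total / horizon


if __name__ == "__main__":
    print(simulate())
```

In this sketch the policy only compares beliefs with one another, so a channel observed in the good state is held at `p11` and one observed in the bad state drops to `p01`. This is meant to give intuition for the low implementation complexity mentioned in the abstract, not to reproduce the paper's results.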

Original language	English
Article number	5605371
Pages (from-to)	5547-5567
Number of pages	21
Journal	IEEE Transactions on Information Theory
Volume	56
Issue number	11
DOIs	https://doi.org/10.1109/TIT.2010.2068950
Publication status	Published - Nov 2010

Keywords

Dynamic channel selection
indexability
myopic policy
opportunistic access
restless multiarmed bandit (RMAB)
Whittle index

Cite this

@article{fb4ca4b2fb9941f9a0cf5c54b34b5803,
title = "Indexability of restless bandit problems and optimality of Whittle index for dynamic multichannel access",
abstract = "In this paper, we consider a class of restless multiarmed bandit processes (RMABs) that arises in dynamic multichannel access, user/server scheduling, and optimal activation in multiagent systems. For this class of RMABs, we establish the indexability and obtain Whittle index in closed form for both discounted and average reward criteria. These results lead to a direct implementation of Whittle index policy with remarkably low complexity. When arms are stochastically identical, we show that Whittle index policy is optimal under certain conditions. Furthermore, it has a semiuniversal structure that obviates the need to know the Markov transition probabilities. The optimality and the semiuniversal structure result from the equivalence between Whittle index policy and the myopic policy established in this work. For nonidentical arms, we develop efficient algorithms for computing a performance upper bound given by Lagrangian relaxation. The tightness of the upper bound and the near-optimal performance of Whittle index policy are illustrated with simulation examples.",
keywords = "Dynamic channel selection, indexability, myopic policy, opportunistic access, restless multiarmed bandit (RMAB), Whittle index",
author = "Keqin Liu and Qing Zhao",
year = "2010",
month = nov,
doi = "10.1109/TIT.2010.2068950",
language = "English",
volume = "56",
pages = "5547--5567",
journal = "IEEE Transactions on Information Theory",
issn = "0018-9448",
number = "11",
}

TY - JOUR
T1 - Indexability of restless bandit problems and optimality of Whittle index for dynamic multichannel access
AU - Liu, Keqin
AU - Zhao, Qing
PY - 2010/11
Y1 - 2010/11
AB - In this paper, we consider a class of restless multiarmed bandit processes (RMABs) that arises in dynamic multichannel access, user/server scheduling, and optimal activation in multiagent systems. For this class of RMABs, we establish the indexability and obtain Whittle index in closed form for both discounted and average reward criteria. These results lead to a direct implementation of Whittle index policy with remarkably low complexity. When arms are stochastically identical, we show that Whittle index policy is optimal under certain conditions. Furthermore, it has a semiuniversal structure that obviates the need to know the Markov transition probabilities. The optimality and the semiuniversal structure result from the equivalence between Whittle index policy and the myopic policy established in this work. For nonidentical arms, we develop efficient algorithms for computing a performance upper bound given by Lagrangian relaxation. The tightness of the upper bound and the near-optimal performance of Whittle index policy are illustrated with simulation examples.
KW - Dynamic channel selection
KW - indexability
KW - myopic policy
KW - opportunistic access
KW - restless multiarmed bandit (RMAB)
KW - Whittle index
UR - http://www.scopus.com/inward/record.url?scp=77958597180&partnerID=8YFLogxK
U2 - 10.1109/TIT.2010.2068950
DO - 10.1109/TIT.2010.2068950
M3 - Article
AN - SCOPUS:77958597180
SN - 0018-9448
VL - 56
SP - 5547
EP - 5567
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 11
M1 - 5605371
ER -
