Abstract
A restless multi-armed bandit problem that arises in multichannel opportunistic communications is considered, where channels are modeled as independent and identical Gilbert-Elliot channels and channel state detection is subject to errors. A simple structure of the myopic policy is established under a certain condition on the false alarm probability of the channel state detector. It is shown that myopic actions can be obtained by maintaining a simple channel ordering without knowing the underlying Markovian model. The optimality of the myopic policy is proved for the case of two channels and conjectured for general cases. Lower and upper bounds on the performance of the myopic policy are obtained in closed form, which characterize the scaling behavior of the achievable throughput of the multichannel opportunistic system. The approximation factor of the myopic policy is also analyzed to bound its worst-case performance loss with respect to the optimal performance.
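
As an illustration of what a myopic policy does in this setting, the following minimal sketch tracks a belief (probability of being in the good state) for each channel and always senses the channel with the highest belief. It is not the channel-ordering structure established in the paper, which avoids needing the Markov model. It is the generic belief-based form, written under assumptions that the abstract does not state: the user learns only whether a transmission was acknowledged, a missing acknowledgement arises either from a bad channel or from a false alarm on a good channel, and the names `p11`, `p01`, and `eps` (good-to-good probability, bad-to-good probability, false alarm probability) are hypothetical.

```python
def propagate(omega, p11, p01):
    """One-step Markov prediction of the probability that a channel is good."""
    return omega * p11 + (1.0 - omega) * p01


def myopic_action(beliefs):
    """Pick the channel with the largest belief (ties go to the lowest index)."""
    return max(range(len(beliefs)), key=lambda i: beliefs[i])


def update_beliefs(beliefs, chosen, ack, p11, p01, eps):
    """Update all beliefs after sensing channel `chosen`.

    Assumed observation model (not taken from the abstract):
      - ack=True  : the channel was certainly good in this slot.
      - ack=False : the channel was bad, or it was good and the detector
                    raised a false alarm (probability eps).
    """
    updated = []
    for i, omega in enumerate(beliefs):
        if i != chosen:
            # Unobserved channel: only the Markov prediction applies.
            updated.append(propagate(omega, p11, p01))
        elif ack:
            # Known good now, so it is good next slot with probability p11.
            updated.append(p11)
        else:
            # Bayes rule for P(good | no acknowledgement), then predict.
            denom = eps * omega + (1.0 - omega)
            posterior = eps * omega / denom if denom > 0 else omega
            updated.append(propagate(posterior, p11, p01))
    return updated
```

With positively correlated channels (`p11 > p01`), an acknowledgement keeps the chosen channel at the top of the ranking, while a missing acknowledgement pushes its belief down. This is the intuition behind replacing explicit beliefs with a simple ordering.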
Original language | English |
---|---|
Article number | 5398950 |
Pages (from-to) | 2795-2808 |
Number of pages | 14 |
Journal | IEEE Transactions on Signal Processing |
Volume | 58 |
Issue number | 5 |
Publication status | Published - May 2010 |
Keywords
- Cognitive radio
- Dynamic multichannel access
- Myopic policy
- Restless multi-armed bandit