Standard

Gittins index for simple family of Markov bandit processes with switching cost and no discounting. / Savelov, M. P.

In: Theory of Probability and its Applications, Vol. 64, No. 3, 01.01.2019, p. 355-364.

Research output: Contribution to journal › Article › peer-review

Harvard

Savelov, MP 2019, 'Gittins index for simple family of Markov bandit processes with switching cost and no discounting', Theory of Probability and its Applications, vol. 64, no. 3, pp. 355-364. https://doi.org/10.1137/S0040585X97T989544

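APA

Savelov, M. P. (2019). Gittins index for simple family of Markov bandit processes with switching cost and no discounting. Theory of Probability and its Applications, 64(3), 355-364. https://doi.org/10.1137/S0040585X97T989544
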
Vancouver

Savelov MP. Gittins index for simple family of Markov bandit processes with switching cost and no discounting. Theory of Probability and its Applications. 2019 Jan 1;64(3):355-364. doi: 10.1137/S0040585X97T989544

Author

Savelov, M. P. / Gittins index for simple family of Markov bandit processes with switching cost and no discounting. In: Theory of Probability and its Applications. 2019 ; Vol. 64, No. 3. pp. 355-364.

BibTeX

@article{ef22b038c177476ea3ab21e7e627c554,
title = "Gittins index for simple family of Markov bandit processes with switching cost and no discounting",
abstract = "We consider the multiarmed bandit problem (the problem of Markov bandits) with switching penalties and no discounting in case when state spaces of all bandits are finite. An optimal strategy should have the largest average reward per unit time on an infinite time horizon. For this problem it is shown that an optimal strategy can be specified by a Gittins index under the natural assumption that the switching penalties are nonnegative.",
keywords = "Controlled Markov processes, Gittins index, Long run average return, Markov decision process, Multiarmed bandit problem, Multicomponent systems, No discounting, Optimal strategy, Simple family of alternative Markov bandit processes, Switching penalties",
author = "Savelov, {M. P.}",
year = "2019",
month = jan,
day = "1",
doi = "10.1137/S0040585X97T989544",
language = "English",
volume = "64",
pages = "355--364",
journal = "Theory of Probability and its Applications",
issn = "0040-585X",
publisher = "SIAM PUBLICATIONS",
number = "3",
}

RIS

TY - JOUR

T1 - Gittins index for simple family of Markov bandit processes with switching cost and no discounting

AU - Savelov, M. P.

PY - 2019/1/1

Y1 - 2019/1/1

N2 - We consider the multiarmed bandit problem (the problem of Markov bandits) with switching penalties and no discounting in case when state spaces of all bandits are finite. An optimal strategy should have the largest average reward per unit time on an infinite time horizon. For this problem it is shown that an optimal strategy can be specified by a Gittins index under the natural assumption that the switching penalties are nonnegative.

AB - We consider the multiarmed bandit problem (the problem of Markov bandits) with switching penalties and no discounting in case when state spaces of all bandits are finite. An optimal strategy should have the largest average reward per unit time on an infinite time horizon. For this problem it is shown that an optimal strategy can be specified by a Gittins index under the natural assumption that the switching penalties are nonnegative.

KW - Controlled Markov processes

KW - Gittins index

KW - Long run average return

KW - Markov decision process

KW - Multiarmed bandit problem

KW - Multicomponent systems

KW - No discounting

KW - Optimal strategy

KW - Simple family of alternative Markov bandit processes

KW - Switching penalties

UR - http://www.scopus.com/inward/record.url?scp=85074360690&partnerID=8YFLogxK

U2 - 10.1137/S0040585X97T989544

DO - 10.1137/S0040585X97T989544

M3 - Article

AN - SCOPUS:85074360690

VL - 64

SP - 355

EP - 364

JO - Theory of Probability and its Applications

JF - Theory of Probability and its Applications

SN - 0040-585X

IS - 3

ER -
