Reinforcement Learning for Long-Term Reward Optimization in Recommender Systems

Standard

Reinforcement Learning for Long-Term Reward Optimization in Recommender Systems. / Dorozhko, Anton.

SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. p. 862-867 8958202 (SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Harvard

Dorozhko, A 2019, Reinforcement Learning for Long-Term Reward Optimization in Recommender Systems. in SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings, 8958202, SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings, Institute of Electrical and Electronics Engineers Inc., pp. 862-867, 2019 International Multi-Conference on Engineering, Computer and Information Sciences, SIBIRCON 2019, Novosibirsk, Russian Federation, 21.10.2019. https://doi.org/10.1109/SIBIRCON48586.2019.8958202

APA

Dorozhko, A. (2019). Reinforcement Learning for Long-Term Reward Optimization in Recommender Systems. In SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings (pp. 862-867). [8958202] (SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/SIBIRCON48586.2019.8958202

Vancouver

Dorozhko A. Reinforcement Learning for Long-Term Reward Optimization in Recommender Systems. In SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings. Institute of Electrical and Electronics Engineers Inc. 2019. p. 862-867. 8958202. (SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings). doi: 10.1109/SIBIRCON48586.2019.8958202

Author

Dorozhko, Anton. / Reinforcement Learning for Long-Term Reward Optimization in Recommender Systems. SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 862-867 (SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings).

BibTeX

@inproceedings{ff1e8b1f911e4734a7fec9dc6775fce6,

title = "Reinforcement Learning for Long-Term Reward Optimization in Recommender Systems",

abstract = "Recommender systems help users navigate the vast space of goods, services, and events. A user interacts with the recommender engine through a sequence of exchanges of recommendations and user feedback. The idea that previous interactions influence later ones, and the importance of the sequence of interactions, can be modeled using Markov decision processes and solved by reinforcement learning. Several recent articles applying reinforcement learning to recommender systems have demonstrated the viability of this direction, but it is still difficult to compare different approaches. We propose an environment with a unified interface that makes it possible to compare different formulations of the recommendation process and different algorithms on the same underlying sequential data. We also performed an extensive parameter study of deep deterministic policy gradient methods on the well-known MovieLens dataset.",

keywords = "DDPG, deep reinforcement learning (DRL), long-term value, recommender systems, reinforcement learning",

author = "Anton Dorozhko",

note = "Publisher Copyright: {\textcopyright} 2019 IEEE. Copyright: Copyright 2020 Elsevier B.V., All rights reserved.; 2019 International Multi-Conference on Engineering, Computer and Information Sciences, SIBIRCON 2019 ; Conference date: 21-10-2019 Through 27-10-2019",

year = "2019",

month = oct,

doi = "10.1109/SIBIRCON48586.2019.8958202",

language = "English",

isbn = "978-1-7281-4402-3",

series = "SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "862--867",

booktitle = "SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings",

address = "United States",

}

RIS

TY - GEN

T1 - Reinforcement Learning for Long-Term Reward Optimization in Recommender Systems

AU - Dorozhko, Anton

PY - 2019/10

Y1 - 2019/10

N2 - Recommender systems help users navigate the vast space of goods, services, and events. A user interacts with the recommender engine through a sequence of exchanges of recommendations and user feedback. The idea that previous interactions influence later ones, and the importance of the sequence of interactions, can be modeled using Markov decision processes and solved by reinforcement learning. Several recent articles applying reinforcement learning to recommender systems have demonstrated the viability of this direction, but it is still difficult to compare different approaches. We propose an environment with a unified interface that makes it possible to compare different formulations of the recommendation process and different algorithms on the same underlying sequential data. We also performed an extensive parameter study of deep deterministic policy gradient methods on the well-known MovieLens dataset.

AB - Recommender systems help users navigate the vast space of goods, services, and events. A user interacts with the recommender engine through a sequence of exchanges of recommendations and user feedback. The idea that previous interactions influence later ones, and the importance of the sequence of interactions, can be modeled using Markov decision processes and solved by reinforcement learning. Several recent articles applying reinforcement learning to recommender systems have demonstrated the viability of this direction, but it is still difficult to compare different approaches. We propose an environment with a unified interface that makes it possible to compare different formulations of the recommendation process and different algorithms on the same underlying sequential data. We also performed an extensive parameter study of deep deterministic policy gradient methods on the well-known MovieLens dataset.

KW - DDPG

KW - deep reinforcement learning (DRL)

KW - long-term value

KW - recommender systems

KW - reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=85079055019&partnerID=8YFLogxK

UR - https://www.mendeley.com/catalogue/5da60ad4-4ab1-377e-b14f-e91062fdc017/

U2 - 10.1109/SIBIRCON48586.2019.8958202

DO - 10.1109/SIBIRCON48586.2019.8958202

M3 - Conference contribution

AN - SCOPUS:85079055019

SN - 978-1-7281-4402-3

T3 - SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings

SP - 862

EP - 867

BT - SIBIRCON 2019 - International Multi-Conference on Engineering, Computer and Information Sciences, Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2019 International Multi-Conference on Engineering, Computer and Information Sciences, SIBIRCON 2019

Y2 - 21 October 2019 through 27 October 2019

ER -

ID: 28278727