Deep Hierarchy in Bandits

Hong, Joey; Kveton, Branislav; Katariya, Sumeet; Zaheer, Manzil; Ghavamzadeh, Mohammad

arXiv:2202.01454 (cs)

[Submitted on 3 Feb 2022]

Abstract: Mean rewards of actions are often correlated. The form of these correlations may be complex and unknown a priori, such as the preferences of a user for recommended products and their categories. To maximize statistical efficiency, it is important to leverage these correlations when learning. We formulate a bandit variant of this problem where the correlations of mean action rewards are represented by a hierarchical Bayesian model with latent variables. Since the hierarchy can have multiple layers, we call it deep. We propose a hierarchical Thompson sampling algorithm (HierTS) for this problem, and show how to implement it efficiently for Gaussian hierarchies. The efficient implementation is possible due to a novel exact hierarchical representation of the posterior, which itself is of independent interest. We use this exact posterior to analyze the Bayes regret of HierTS in Gaussian bandits. Our analysis reflects the structure of the problem, that the regret decreases with the prior width, and also shows that hierarchies reduce the regret by non-constant factors in the number of actions. We confirm these theoretical findings empirically, in both synthetic and real-world experiments.
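To make the idea of hierarchical posterior sampling concrete, the following is a minimal sketch of Thompson sampling in the simplest Gaussian hierarchy: a single latent mean shared by all actions. It is not the authors' implementation, and the paper itself treats deeper trees. The model and every name here are assumptions made for illustration: a latent mean mu ~ N(mu0, sigma_q^2), action means theta_a ~ N(mu, sigma_0^2), and rewards ~ N(theta_a, sigma^2). The functions `latent_posterior`, `action_posterior`, `hier_ts_step`, and `run` are hypothetical helpers.

```python
import numpy as np


def latent_posterior(counts, sums, mu0, sigma_q, sigma_0, sigma):
    """Posterior mean and variance of the shared latent mean mu (assumed model)."""
    counts = np.asarray(counts, dtype=float)
    sums = np.asarray(sums, dtype=float)
    pulled = counts > 0
    # With theta_a marginalized out, the sample mean of action a is
    # distributed as N(mu, sigma_0^2 + sigma^2 / n_a).
    marg_var = sigma_0**2 + sigma**2 / counts[pulled]
    sample_means = sums[pulled] / counts[pulled]
    precision = 1.0 / sigma_q**2 + np.sum(1.0 / marg_var)
    mean = (mu0 / sigma_q**2 + np.sum(sample_means / marg_var)) / precision
    return mean, 1.0 / precision


def action_posterior(counts, sums, mu, sigma_0, sigma):
    """Posterior of each action mean theta_a, conditioned on a value of mu."""
    counts = np.asarray(counts, dtype=float)
    sums = np.asarray(sums, dtype=float)
    precision = 1.0 / sigma_0**2 + counts / sigma**2
    mean = (mu / sigma_0**2 + sums / sigma**2) / precision
    return mean, 1.0 / precision


def hier_ts_step(counts, sums, rng, mu0=0.0, sigma_q=1.0, sigma_0=0.5, sigma=1.0):
    """Sample top-down (mu first, then every theta_a) and act greedily."""
    m, v = latent_posterior(counts, sums, mu0, sigma_q, sigma_0, sigma)
    mu = rng.normal(m, np.sqrt(v))
    means, variances = action_posterior(counts, sums, mu, sigma_0, sigma)
    theta = rng.normal(means, np.sqrt(variances))
    return int(np.argmax(theta))


def run(true_means, horizon, seed=0, sigma=1.0):
    """Simulate the sketch on a fixed bandit instance; returns pull counts."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    counts, sums = np.zeros(k), np.zeros(k)
    for _ in range(horizon):
        a = hier_ts_step(counts, sums, rng, sigma=sigma)
        reward = rng.normal(true_means[a], sigma)
        counts[a] += 1
        sums[a] += reward
    return counts
```

In this sketch, observations of one action tighten the posterior of mu, which in turn shifts the conditional posterior of every other action. That sharing is the mechanism by which a hierarchy can exploit correlated mean rewards. A call such as `run([0.1, 0.2, 0.9], horizon=500)` returns the number of times each action was pulled.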

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2202.01454 [cs.LG]
	(or arXiv:2202.01454v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2202.01454

Submission history

From: Branislav Kveton
[v1] Thu, 3 Feb 2022 08:15:53 UTC (2,390 KB)
