Preference-based Pure Exploration
arXiv:2412.02988v1 Announce Type: cross
Abstract: We study the preference-based pure exploration problem for bandits with vector-valued rewards. The rewards are ordered using a (given) preference cone $\mathcal{C}$, and our goal is to identify the set of Pareto optimal arms. First, to quantify th…
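
To make the notion of Pareto optimality under a preference cone concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's algorithm: it assumes the mean reward vectors are known exactly (no sampling), that $\mathcal{C}$ is a polyhedral cone written as $\{x : Ax \ge 0\}$, and that arm $j$ dominates arm $i$ when $\mu_j - \mu_i$ is a nonzero element of $\mathcal{C}$. The names `in_cone`, `pareto_optimal_arms`, `means`, and `A` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def in_cone(x: Vector, A: Sequence[Vector], tol: float = 1e-12) -> bool:
    """Return True if A x >= 0 componentwise, i.e. x lies in the
    polyhedral cone {x : A x >= 0} (an assumed representation of C)."""
    return all(sum(a * v for a, v in zip(row, x)) >= -tol for row in A)


def pareto_optimal_arms(means: Sequence[Vector], A: Sequence[Vector],
                        tol: float = 1e-12) -> List[int]:
    """Indices of arms not dominated by any other arm.

    Assumed convention: arm j dominates arm i when means[j] - means[i]
    is a nonzero vector in the cone.
    """
    optimal = []
    for i, mu_i in enumerate(means):
        dominated = False
        for j, mu_j in enumerate(means):
            if i == j:
                continue
            diff = [b - a for a, b in zip(mu_i, mu_j)]
            is_nonzero = any(abs(d) > tol for d in diff)
            if is_nonzero and in_cone(diff, A, tol):
                dominated = True
                break
        if not dominated:
            optimal.append(i)
    return optimal


# Hypothetical mean reward vectors for three arms with two objectives.
means = [(1.0, 0.0), (0.9, 0.3), (0.2, 0.2)]

# Positive orthant: recovers the usual componentwise Pareto order.
orthant = [(1.0, 0.0), (0.0, 1.0)]
# A wider cone that contains the orthant, so more pairs are comparable.
wider = [(2.0, 1.0), (1.0, 2.0)]

pareto_orthant = pareto_optimal_arms(means, orthant)
pareto_wider = pareto_optimal_arms(means, wider)
print(pareto_orthant)  # [0, 1]
print(pareto_wider)    # [1]
```

With the positive orthant, arms 0 and 1 are incomparable and both remain optimal. Under the wider cone, arm 1 dominates arm 0, so the Pareto set shrinks to a single arm. This shows how the choice of cone changes the target set the learner must identify.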