Log Fire

Preference-based Pure Exploration

arXiv:2412.02988v1 Announce Type: cross
Abstract: We study the preference-based pure exploration problem for bandits with vector-valued rewards. The rewards are ordered using a (given) preference cone $\mathcal{C}$, and our goal is to identify the set of Pareto optimal arms. First, to quantify th…
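
To make the notion of Pareto optimality under a preference cone concrete, here is a minimal sketch, not the paper's algorithm. It assumes the mean reward vectors are known exactly (no sampling), that the cone is polyhedral and given as $\mathcal{C} = \{x : Wx \ge 0\}$, and that arm $j$ dominates arm $i$ when $\mu_j - \mu_i \in \mathcal{C} \setminus \{0\}$. The function names, the matrix `W`, and the tolerance `tol` are hypothetical choices made for illustration; the paper's exact ordering convention may differ.

```python
from typing import List, Sequence

Vector = Sequence[float]


def in_cone(x: Vector, W: Sequence[Vector], tol: float = 1e-12) -> bool:
    """Return True if W x >= 0 row by row, i.e. x lies in the cone {x : W x >= 0}."""
    return all(
        sum(w_k * x_k for w_k, x_k in zip(row, x)) >= -tol
        for row in W
    )


def pareto_optimal_arms(
    means: Sequence[Vector], W: Sequence[Vector], tol: float = 1e-12
) -> List[int]:
    """Indices of arms whose mean vector is not dominated by any other arm.

    Assumed convention: arm j dominates arm i when (mu_j - mu_i) is a
    nonzero element of the cone described by W.
    """
    optimal = []
    for i, mu_i in enumerate(means):
        dominated = False
        for j, mu_j in enumerate(means):
            if i == j:
                continue
            diff = [a - b for a, b in zip(mu_j, mu_i)]
            is_nonzero = any(abs(d) > tol for d in diff)
            if is_nonzero and in_cone(diff, W, tol):
                dominated = True
                break
        if not dominated:
            optimal.append(i)
    return optimal


# Example: with W equal to the identity, the cone is the positive orthant
# and the ordering reduces to the usual componentwise Pareto order.
means = [(1.0, 0.0), (0.0, 1.0), (0.4, 0.4), (0.6, 0.6)]
identity = [(1.0, 0.0), (0.0, 1.0)]
print(pareto_optimal_arms(means, identity))  # [0, 1, 3]
```

In the bandit setting the means are unknown and must be estimated from samples, so a check like this would only be applied to estimates. Deciding how many samples are needed before the estimated Pareto set can be trusted is the pure exploration question the abstract refers to.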
