Log Fire

Automatic Double Reinfor

Automatic Double Reinforc
Automatic Double Reinforcement Learning in Semiparametric Markov Decision Processes with Applications to Long-Term Causal Inference

arXiv:2501.06926v1 Announce Type: cross
Abstract: Double reinforcement learning (DRL) enables statistically efficient inference on the value of a policy in a nonparametric Markov Decision Process (MDP) given trajectories generated by another policy. However, this approach necessarily requires strin…

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *