
research note
Global Optimality for Constrained Exploration via Penalty Regularization
This paper addresses constrained maximum-entropy exploration in reinforcement learning (RL), where an agent seeks a policy that maximizes the entropy of the induced state…