research note
Reinforcement Learning for Exponential Utility: Algorithms and Convergence in Discounted MDPs
This paper addresses a foundational gap in risk-sensitive reinforcement learning: the absence of principled, model-free, value-based (Q-learning-style) algorithms for optimizing exponential utility…