Reinforcement Learning

Reinforcement learning learns policies through interaction to maximize cumulative reward using algorithms like DQN, PPO, and SAC. Key engineering issues include sample efficiency, exploration strategies, and stable value/policy updates.