DDPG

Deep Deterministic Policy Gradient (DDPG) is a model-free, off-policy actor-critic algorithm designed for continuous control tasks. It combines the deterministic policy gradient with deep neural networks to learn a continuous action policy and a value function, using experience replay and target networks to stabilize learning. It was introduced by Lillicrap et al. in 2015.

The agent consists of an actor network μ(s|θ^μ) that outputs a specific action and a critic network Q(s, a|θ^Q) that evaluates the action. Training uses transitions sampled from a replay buffer. The critic is updated by minimizing the TD error with targets y = r + γ Q′(s′, μ′(s′)), where Q′ and μ′ are the corresponding target networks.
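
A rough sketch of this critic step, assuming PyTorch modules named `critic`, `critic_target`, and `actor_target`, an optimizer `critic_opt`, and minibatches of `(s, a, r, s', done)` tensors drawn from the replay buffer (these names and the terminal mask are illustrative, not taken from the original paper):

```python
import torch


def critic_update(batch, actor_target, critic, critic_target, critic_opt, gamma=0.99):
    """One TD-error minimization step for the critic Q(s, a | theta_Q)."""
    s, a, r, s_next, done = batch  # minibatch sampled from the replay buffer

    with torch.no_grad():
        # Target action from the target actor mu'(s'), evaluated by the target critic Q'
        a_next = actor_target(s_next)
        # (1 - done) is the usual terminal-state mask, an implementation detail
        y = r + gamma * (1.0 - done) * critic_target(s_next, a_next)

    td_error = critic(s, a) - y
    loss = td_error.pow(2).mean()  # mean squared TD error

    critic_opt.zero_grad()
    loss.backward()
    critic_opt.step()
    return loss.item()
```
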
The actor is updated by gradient ascent on Q with respect to the actor parameters: ∇_θ^μ Q(s, μ(s|θ^μ)|θ^Q).
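
The matching actor step can be sketched the same way: minimizing the negative critic value of the actor's own action performs gradient ascent on Q with respect to the actor parameters (the network and optimizer names are assumptions carried over from the snippet above):

```python
def actor_update(batch, actor, critic, actor_opt):
    """Deterministic policy gradient step: ascend Q(s, mu(s | theta_mu))."""
    s, *_ = batch

    # Minimizing -Q is equivalent to gradient ascent on Q w.r.t. the actor parameters.
    actor_loss = -critic(s, actor(s)).mean()

    actor_opt.zero_grad()
    actor_loss.backward()  # gradients flow through the critic into the actor
    actor_opt.step()
    return actor_loss.item()
```
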
Exploration is typically provided by a temporally correlated noise process, such as Ornstein–Uhlenbeck noise, added to the actor’s actions during data collection.
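
A minimal Ornstein–Uhlenbeck process for this kind of exploration might look as follows; the θ = 0.15, σ = 0.2, and dt values are common illustrative defaults rather than values prescribed here:

```python
import numpy as np


class OUNoise:
    """Temporally correlated Ornstein-Uhlenbeck noise for exploration."""

    def __init__(self, action_dim, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2):
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.x = np.full(action_dim, mu, dtype=np.float64)

    def reset(self):
        self.x[:] = self.mu

    def sample(self):
        # dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        dx = (self.theta * (self.mu - self.x) * self.dt
              + self.sigma * np.sqrt(self.dt) * np.random.randn(*self.x.shape))
        self.x += dx
        return self.x.copy()


# During rollout: action = actor(state) + noise.sample(), clipped to the valid action range.
```
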
To improve stability, DDPG uses target networks with soft updates and keeps the continuous action space bounded with squashing functions (e.g., tanh) so that actions stay within valid limits. It is trained off-policy, which allows learning from a fixed dataset, but it can be sensitive to hyperparameters and to correlations in the replayed experience. The original algorithm uses a deterministic policy, which enables efficient policy gradients but requires a good exploration strategy.
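
A sketch of the soft ("Polyak") target update and of tanh action squashing, with τ = 0.005 and `max_action` as assumed example values:

```python
import torch


def soft_update(target_net, online_net, tau=0.005):
    """theta_target <- tau * theta + (1 - tau) * theta_target."""
    with torch.no_grad():
        for p_t, p in zip(target_net.parameters(), online_net.parameters()):
            p_t.mul_(1.0 - tau).add_(tau * p)


def bounded_action(raw_output, max_action):
    """Squash an unbounded actor output into [-max_action, max_action]."""
    return max_action * torch.tanh(raw_output)
```

In this kind of setup, the soft update is applied to both the target actor and the target critic after each learning step.
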
DDPG has inspired subsequent developments such as Twin Delayed DDPG (TD3) and DDPG from Demonstrations (DDPGfD), which address issues like overestimation bias and sample efficiency. The method has been applied to simulated robotics, autonomous control, and other continuous-action domains.