1 change: 1 addition & 0 deletions docs/arxiv/empirical/main.tex
@@ -171,6 +171,7 @@

\section{Introduction}
\todo[inline]{Introduction to DDPG and recent advances in deep RL. }
[INSERT OPENING SENTENCE HERE] The current state of the art in deep reinforcement learning is the Deep Deterministic Policy Gradient (DDPG) algorithm \cite{lillicrap2015ddpg}, which extended the deterministic policy gradient (DPG) algorithm \cite{silver2014dpg} to continuous, high-dimensional action spaces with considerable success. DDPG is an actor-critic algorithm built on DPG: the critic $Q(s, a)$ is learned in a model-free fashion as in deep Q-network learning \cite{mnih2013dqn}, and the actor $\mu(s)$ is updated by sampling the deterministic policy gradient of \cite{silver2014dpg}. The algorithm achieved performance comparable to planning-based solvers on many physical control problems.
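As a minimal sketch of the two updates described above (notation assumed here: transitions $(s, a, r, s')$ drawn from the replay buffer, target networks $Q'$ and $\mu'$, and discount factor $\gamma$), the critic is regressed onto a bootstrapped target and the actor follows the sampled deterministic policy gradient:
\begin{align}
  L(\theta^{Q}) &= \mathbb{E}\!\left[\big(r + \gamma\, Q'(s', \mu'(s')) - Q(s, a)\big)^{2}\right], \\
  \nabla_{\theta^{\mu}} J &\approx \mathbb{E}\!\left[\nabla_{a} Q(s, a)\big|_{a = \mu(s)}\, \nabla_{\theta^{\mu}} \mu(s)\right].
\end{align}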
This looks good, but I would then add a possible second paragraph describing the downsides of this algorithm. *We need to motivate the rest of the paper!* It could cover:

1. Divergence.
2. Hyperparameter instability (some $\gamma$s work and others do not; in practice the method requires a lot of tuning, and you will obviously need to cite evidence for this argument).
3. The replay buffer is hacky; try to deconstruct the reasons why its use is essential for the DDPG algorithm.


\todo[inline]{Biological diffusion of dopamine in the brain $\implies$ error backpropagation is not biologically feasible.}
