Deep Learning: Difference between revisions
Line 2,154: | Line 2,154: | ||
<math> | <math> | ||
\begin{aligned} | \begin{aligned} | ||
\nabla_{\theta} J(\theta) &=\\ | \nabla_{\theta} J(\theta) &= E[...]\\ | ||
&\approx \frac{1}{N} \sum_{i=1}^{N}(\sum \nabla_{\theta} \log \pi(a_t^{(i)} \mid s_t^{(i)}) .... | &\approx \frac{1}{N} \sum_{i=1}^{N}(\sum \nabla_{\theta} \log \pi(a_t^{(i)} \mid s_t^{(i)}) .... | ||
\end{aligned} | \end{aligned} |