Deep Learning: Difference between revisions

Line 2,154: Line 2,154:
<math>
<math>
\begin{aligned}
\begin{aligned}
\nabla_{\theta} J(\theta) &=\\
\nabla_{\theta} J(\theta) &= E[...]\\
&\approx \frac{1}{N} \sum_{i=1}^{N}(\sum \nabla_{\theta} \log \pi(a_t^{(i)} | s_t^{(i)}) ....
&\approx \frac{1}{N} \sum_{i=1}^{N}(\sum \nabla_{\theta} \log \pi(a_t^{(i)} | s_t^{(i)}) ....
\end{aligned}
\end{aligned}