Deep Learning



<math>\nabla_{\theta} J(\theta) \approx \frac{1}{N} \sum_{i=1}^{N} \sum_{t=1}^{T} \nabla_{\theta} \log \pi_{\theta}(a_t^{(i)} | s_t^{(i)}) \left(Q(s_t^{(i)}, a_t^{(i)}) - V(s_t^{(i)})\right)</math>
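
The estimator above can be illustrated with a small numerical sketch. It is not from the lecture: it assumes a tabular softmax policy whose parameters <code>theta</code> are a table of logits (one row per state), and it assumes the estimates of <math>Q</math> and <math>V</math> are already available as arrays. The function name <code>policy_gradient_estimate</code> and all argument names are hypothetical, chosen only for this example.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def policy_gradient_estimate(theta, states, actions, q_values, v_values):
    """Monte Carlo policy gradient with advantage Q - V.

    Assumptions (illustrative only):
      theta:    float array (num_states, num_actions), logits of a
                tabular softmax policy pi_theta(a | s)
      states:   int array (N, T), visited states s_t^(i)
      actions:  int array (N, T), chosen actions a_t^(i)
      q_values: float array (N, T), estimates of Q(s_t^(i), a_t^(i))
      v_values: float array (N, T), estimates of V(s_t^(i))
    Returns an array shaped like theta.
    """
    num_trajectories = states.shape[0]
    probs = softmax(theta)
    advantages = q_values - v_values
    grad = np.zeros_like(theta, dtype=float)

    for s, a, adv in zip(states.ravel(), actions.ravel(), advantages.ravel()):
        # For a softmax policy, the gradient of log pi(a | s) with respect
        # to the logits of state s is onehot(a) - pi(. | s).
        score = -probs[s].copy()
        score[a] += 1.0
        grad[s] += adv * score

    # Average over the N trajectories (the sum over t is not averaged).
    return grad / num_trajectories


# One trajectory of length two under a uniform initial policy.
theta = np.zeros((2, 3))
states = np.array([[0, 1]])
actions = np.array([[2, 0]])
q_values = np.array([[1.0, 0.5]])
v_values = np.array([[0.0, 0.5]])
grad = policy_gradient_estimate(theta, states, actions, q_values, v_values)
```

In this toy input the second step has zero advantage, so only the row of <code>grad</code> for state 0 is nonzero. Subtracting <math>V</math> as a baseline leaves the expected gradient unchanged while typically reducing its variance.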
===Other topics in RL===
* Inverse RL
* Multi-agent RL
* Model-based RL
==Summary of Course==
;What we covered
* Supervised DL
* Unsupervised DL (GANs, VAEs)
* Self-supervised DL
* Meta-Learning
* Learning with Attention (Transformers)
* Deep RL
* Optimization
* Generalization
* Robustness
* Interpretability
;What we didn't cover
* Fairness
* Privacy & Ethics
* Bayesian DL
* Federated Learning
* Graph NNs
;Things which may be on the final
* Transformers
* Wasserstein distance
* Kernel methods


==Misc==