Numerical Optimization: Difference between revisions
==Resources==
* [https://link.springer.com/book/10.1007%2F978-0-387-40065-5 Numerical Optimization by Nocedal and Wright (2006)]
Revision as of 02:30, 9 November 2019
Numerical Optimization
Line Search Methods
Basic idea:
- For each iteration:
  - Find a direction \(\displaystyle p\).
  - Then find a step length \(\displaystyle \alpha\) which decreases \(\displaystyle f\).
  - Take a step \(\displaystyle \alpha p\).
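The loop above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation. It assumes the steepest-descent direction \(\displaystyle p = -\nabla f(x)\) and a backtracking search that shrinks \(\displaystyle \alpha\) until a sufficient-decrease (Armijo) condition holds. Neither choice is prescribed by the text above, and the names `alpha0`, `rho`, `c`, and `tol` are hypothetical parameters introduced for the example.

```python
import numpy as np

def backtracking_step_length(f, grad_f, x, p, alpha0=1.0, rho=0.5, c=1e-4, max_iter=50):
    """Shrink alpha until f decreases sufficiently along p (Armijo condition).

    alpha0, rho and c are illustrative defaults, not values from the article.
    """
    fx = f(x)
    slope = grad_f(x) @ p  # directional derivative; negative for a descent direction
    alpha = alpha0
    for _ in range(max_iter):
        if f(x + alpha * p) <= fx + c * alpha * slope:
            break
        alpha *= rho
    return alpha

def line_search_minimize(f, grad_f, x0, tol=1e-8, max_iter=1000):
    """Repeat: pick a direction p, pick a step length alpha, take the step alpha * p."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:
            break
        p = -g  # assumed direction: steepest descent
        alpha = backtracking_step_length(f, grad_f, x, p)
        x = x + alpha * p
    return x

# Example objective (made up for illustration): a convex quadratic with minimizer (1, -3).
def example_f(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2

def example_grad(x):
    return np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)])

x_min = line_search_minimize(example_f, example_grad, [0.0, 0.0])
```

Other direction choices (Newton, quasi-Newton) fit the same loop by replacing the line that sets `p`.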
Trust Region Methods
Basic idea:
- For each iteration:
  - Assume a quadratic model of your objective function near a point.
  - Find a region where you trust your model accurately represents your objective function.
  - Take a step.
Variables:
- \(\displaystyle f\) is your objective function.
- \(\displaystyle m_k\) is your quadratic model at iteration \(\displaystyle k\).
- \(\displaystyle x_k\) is your point at iteration \(\displaystyle k\).
Your model is \(\displaystyle m_k(p) = f_k + g_k^T p + \frac{1}{2}p^T B_k p\)
where \(\displaystyle g_k = \nabla f(x_k)\) and \(\displaystyle B_k\) is a symmetric matrix.
At each iteration, you solve a constrained optimization subproblem to find the best step \(\displaystyle p\):
\(\displaystyle \min_{p \in \mathbb{R}^n} m_k(p)\) such that \(\displaystyle \Vert p \Vert \leq \Delta_k \), where \(\displaystyle \Delta_k \gt 0\) is the trust-region radius.
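As a small illustration, the model and the trust-region constraint can be written directly in Python. This is only a sketch: the function names `quadratic_model` and `in_trust_region` and the example data are hypothetical, and the Euclidean norm is assumed for \(\displaystyle \Vert p \Vert\).

```python
import numpy as np

def quadratic_model(f_k, g_k, B_k, p):
    """Evaluate m_k(p) = f_k + g_k^T p + 0.5 * p^T B_k p."""
    return f_k + g_k @ p + 0.5 * (p @ B_k @ p)

def in_trust_region(p, delta_k):
    """Check the constraint ||p|| <= delta_k (Euclidean norm assumed)."""
    return np.linalg.norm(p) <= delta_k

# Made-up example: f(x) = 0.5 * x^T A x - b^T x, so grad f(x) = A x - b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])

def f(x):
    return 0.5 * (x @ A @ x) - b @ x

x_k = np.array([0.5, -0.5])
g_k = A @ x_k - b   # gradient at x_k
B_k = A             # here B_k is taken to be the exact Hessian
p = np.array([0.1, 0.2])

model_value = quadratic_model(f(x_k), g_k, B_k, p)
true_value = f(x_k + p)
```

Because this example objective is itself quadratic and \(\displaystyle B_k\) is its exact Hessian, `model_value` and `true_value` agree up to rounding. For a general objective the two differ, which is why the step is restricted to a region where the model is trusted.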
Cauchy Point Algorithms
The Cauchy point is \(\displaystyle p_k^c = \tau_k p_k^s\),
where \(\displaystyle p_k^s\) minimizes the linear model in the trust region
\(\displaystyle p_k^s = \operatorname{argmin}_{p \in \mathbb{R}^n} f_k + g_k^Tp \) s.t. \(\displaystyle \Vert p \Vert \leq \Delta_k \)
and \(\displaystyle \tau_k\) minimizes our quadratic model along the direction \(\displaystyle p_k^s\):
\(\displaystyle \tau_k = \operatorname{argmin}_{\tau \geq 0} m_k(\tau p_k^s)\) s.t. \(\displaystyle \Vert \tau p_k^s \Vert \leq \Delta_k \)
This can be written explicitly as \(\displaystyle p_k^c = - \tau_k \frac{\Delta_k}{\Vert g_k \Vert} g_k\), where
\(\displaystyle \tau_k =
\begin{cases}
1 & \text{if }g_k^T B_k g_k \leq 0;\\
\min(\Vert g_k \Vert ^3/(\Delta_k g_k^T B_k g_k), 1) & \text{otherwise}
\end{cases}
\)
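To see where the second case comes from, substitute \(\displaystyle p_k^s = -\frac{\Delta_k}{\Vert g_k \Vert} g_k\) into the model:
\(\displaystyle m_k(\tau p_k^s) = f_k - \tau \Delta_k \Vert g_k \Vert + \frac{1}{2} \tau^2 \frac{\Delta_k^2}{\Vert g_k \Vert^2} g_k^T B_k g_k\).
When \(\displaystyle g_k^T B_k g_k \gt 0\), setting the derivative with respect to \(\displaystyle \tau\) to zero gives \(\displaystyle \tau = \Vert g_k \Vert^3/(\Delta_k g_k^T B_k g_k)\), which is then capped at 1 so the step stays inside the trust region. When \(\displaystyle g_k^T B_k g_k \leq 0\), the model keeps decreasing along \(\displaystyle p_k^s\), so the step goes all the way to the boundary.

A minimal Python sketch of the explicit formula follows. The function name `cauchy_point` is hypothetical, and the handling of a zero gradient (returning a zero step) is an assumption that the formula above does not cover.

```python
import numpy as np

def cauchy_point(g_k, B_k, delta_k):
    """Return p_k^c = -tau_k * (delta_k / ||g_k||) * g_k."""
    g_k = np.asarray(g_k, dtype=float)
    g_norm = np.linalg.norm(g_k)
    if g_norm == 0.0:
        # Assumption: at a stationary point, take no step.
        return np.zeros_like(g_k)
    curvature = g_k @ B_k @ g_k  # g_k^T B_k g_k
    if curvature <= 0:
        tau = 1.0  # model decreases all the way to the boundary
    else:
        tau = min(g_norm ** 3 / (delta_k * curvature), 1.0)
    return -tau * (delta_k / g_norm) * g_k

# Made-up example data: g = (3, 4) so ||g|| = 5, with B the identity.
g = np.array([3.0, 4.0])
B = np.eye(2)

p_small_radius = cauchy_point(g, B, delta_k=1.0)   # tau capped at 1, step on the boundary
p_large_radius = cauchy_point(g, B, delta_k=10.0)  # tau = 0.5, interior minimizer along -g
p_negative_curv = cauchy_point(g, -B, delta_k=2.0) # non-positive curvature, tau = 1
```

With the identity as \(\displaystyle B_k\) and a large radius, the Cauchy point coincides with the full steepest-descent step \(\displaystyle -g_k\), since that is the minimizer of the model along \(\displaystyle -g_k\).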