==Conjugate Gradient Methods==
[https://www.cs.cmu.edu/~15859n/RelatedWork/painless-conjugate-gradient.pdf Painless Conjugate Gradient]<br>
The goal is to solve <math>Ax=b</math> or, equivalently, to compute <math>\min_x \phi(x)</math> where <math>\phi(x)=\frac{1}{2}x^T A x - b^T x</math>; since <math>\nabla \phi(x) = Ax - b</math>, a minimizer of <math>\phi</math> is exactly a solution of the linear system.<br>
In exact arithmetic, the practical CG algorithm converges in at most <math>n</math> steps, where <math>n</math> is the dimension of <math>x</math>.
===Definitions===
Vectors <math>\{p_i\}</math> are conjugate with respect to a symmetric positive definite (SPD) matrix <math>A</math> if <math>p_i^T A p_j = 0</math> for <math>i \neq j</math>.
;Notes
* <math>\{p_i\}</math> are linearly independent
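For intuition, the eigenvectors of an SPD matrix form one conjugate set: if <math>Av_i = \lambda_i v_i</math>, then <math>v_i^T A v_j = \lambda_j v_i^T v_j = 0</math> for <math>i \neq j</math>. The following sketch (not from the original text; any SPD matrix works in place of this randomly built one) checks conjugacy numerically:
<pre>
% Build a random SPD matrix and check that its eigenvectors
% are conjugate (A-orthogonal) to one another.
n = 5;
B = randn(n);
A = B'*B + n*eye(n);        % SPD by construction
[V, ~] = eig(A);            % columns of V are eigenvectors
C = V'*A*V;                 % conjugacy test: should be diagonal
offDiag = C - diag(diag(C));
disp(max(abs(offDiag(:))))  % ~0 up to rounding error
</pre>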
* Take the step size <math>\alpha_k</math> which minimizes <math>\phi(x)</math> along the search direction <math>p_k</math> (exact line search; closed form below)
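For the quadratic <math>\phi</math>, this one-dimensional minimization has a closed form: setting <math>\frac{d}{d\alpha}\phi(x_k + \alpha p_k) = 0</math> gives<br>
<math>\alpha_k = -\frac{r_k^T p_k}{p_k^T A p_k}</math><br>
and in the CG iteration, where <math>r_k^T p_k = -r_k^T r_k</math>, this simplifies to <math>\alpha_k = \frac{r_k^T r_k}{p_k^T A p_k}</math>, the expression used in the code below.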
Practical CG method:<br>
{{ hidden | Code |
Below is code for the practical CG method.<br>
<pre>
x = x0;                     % initial guess
r = A*x - b;                % initial residual
p = -r;                     % initial search direction
k = 1;
tol = 1e-10;                % convergence tolerance on the residual
n = length(x);              % CG needs at most n iterations
while (norm(r) > tol && k <= n)
  Ap = A*p;                 % only matrix-vector product per iteration
  alpha = (r'*r)/(p'*Ap);   % exact line search step
  x = x + alpha * p;        % update the iterate
  rn = r + alpha * Ap;      % new residual, reusing A*p
  beta = (rn'*rn)/(r'*r);   % coefficient for the next direction
  p = -rn + beta * p;       % new conjugate search direction
  r = rn;
  k = k + 1;
end
</pre>
}}
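As a usage sketch (this driver is not from the original page; it assumes the CG loop above is pasted after the setup), one can build a random SPD system and verify the result:
<pre>
n = 100;
B = randn(n);
A = B'*B + n*eye(n);   % random SPD test matrix
b = randn(n, 1);
x0 = zeros(n, 1);      % starting point
% ... run the CG loop above ...
% norm(A*x - b) should then be below tol.
</pre>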


===Theorems===
; Theorem: At step <math>k</math>, the residual <math>r_k</math> is orthogonal to <math>\{p_0,\ldots,p_{k-1}\}</math>, and the current iterate <math>x_k</math>
: minimizes <math>\phi(x)</math> over the set <math>\{x_0 + \operatorname{span}(p_0,\ldots,p_{k-1})\}</math>
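This can be checked numerically (a sketch, assuming the loop above is modified to record each direction in a hypothetical matrix <code>P</code> and stopped early with a loose <code>tol</code> so the residual is nonzero):
<pre>
% Inside the CG loop, record each direction before it is updated:
%   P(:, k) = p;
% After stopping at step k, the theorem predicts that the current
% residual is orthogonal to all stored directions:
disp(max(abs(r' * P)))   % ~0 up to rounding error
</pre>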
===Convergence Rate===
The convergence rate can be estimated by:<br>
<math>\Vert x_{k+1} - x^* \Vert_A^2 \leq \min_{P_k} \max_{1 \leq i \leq n} [1 + \lambda_i P_k(\lambda_i)]^2 \Vert x_0 - x^* \Vert_A^2</math><br>
where <math>P_k</math> ranges over polynomials of degree <math>k</math>, <math>\lambda_1,\ldots,\lambda_n</math> are the eigenvalues of <math>A</math>, and <math>\Vert z \Vert_A^2 = z^T A z</math>.
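In terms of the condition number <math>\kappa(A) = \lambda_n / \lambda_1</math>, this yields the standard bound (see Nocedal and Wright, Chapter 5):<br>
<math>\Vert x_k - x^* \Vert_A \leq 2 \left( \frac{\sqrt{\kappa(A)} - 1}{\sqrt{\kappa(A)} + 1} \right)^k \Vert x_0 - x^* \Vert_A</math>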


==Resources==
* [https://link.springer.com/book/10.1007%2F978-0-387-40065-5 Numerical Optimization by Nocedal and Wright (2006)]