Numerical Optimization

From David's Wiki
Jump to navigation Jump to search

Numerical Optimization

Convergence Rates

The rate of convergence is
Iterative methods have the following convergence rates:

  • If Q=1 and L=1 we have sublinear convergence.
  • If Q=1 and we have linear convergence.
  • If Q=1 and we have superlinear convergence.
  • If Q=2 and we have quadratic convergence.

Line Search Methods

Basic idea:

  • For each iteration
    • Find a direction .
    • Then find a step length which decreases .
    • Take a step .

Trust Region Methods

Basic idea:

  • For each iteration
    • Assume a quadratic model of your objective function near a point.
    • Find a region where you trust your model accurately represents your objective function.
    • Take a step.


  • is your objective function.
  • is your quadratic model at iteration k.
  • is your point at iteration k.

Your model is where and is a symmetric matrix.
At each iteration, you solve a constrained optimization subproblem to find the best step .
such that .

Cauchy Point Algorithms

The Cauchy point
where minimizes the linear model in the trust region
and minimizes our quadratic model along the line :
This can be written explicitly as where

Conjugate Gradient Methods

Painless Conjugate Gradient
The goal is to solve or equivalently where .
The practical CG algorithm will converge in at most steps.


Vectors are conjugate w.r.t SPD matrix if for

  • are linearly independent


Basic idea:

  • Find a conjugate direction
  • Take the step size which minimizes along

Practical CG method:


Below is code for the practical CG method.

x = x0;
r = A*x - b;
p = -r;
k = 1;
while(norm(r) > tole && k < size(x,2))
   Ap = A*p;
   alpha = (r'*r)/(p'*Ap);
   x = x + alpha * p;
   rn = r + alpha * Ap;
   beta = (rn'*rn)/(r'*r);
   p = -rn + beta * p;
   r = rn;
   k = k+1;


Given a set of conjugate directions we can generate a sequence of with
where minimizes

For an , the sequence converges to the solution in at most n steps.
At step k, the residual is orthogonal to and the current iteration
minimizes over the set

Convergence Rate

The convergence rate can be estimated by: