How does the Levenberg–Marquardt algorithm work in detail but in an understandable way?

Minimizing a function is like trying to find the lowest point on a surface. Imagine walking on a hilly surface, trying to get to the lowest point. You would find a direction that goes downhill and walk until it no longer goes downhill. Then you would choose a new downhill direction and walk that way until it no longer goes downhill, and so on. Eventually (hopefully) you would reach a point where no direction goes downhill anymore. You would then be at a (local) minimum.

The LM algorithm, and many other minimization algorithms, use this scheme.
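
To make the walking-downhill scheme concrete, here is a minimal sketch in plain Python. It is not LM itself, just the generic loop described above: pick a downhill direction, choose a step that actually lowers F, repeat. The function name `descend`, the test function `F`, and the constants (initial step 0.1, halving factor) are assumptions chosen for illustration.

```python
def descend(F, grad, x0, step=0.1, tol=1e-8, max_iter=1000):
    """Generic descent loop: pick a downhill direction, step, repeat."""
    x = list(x0)
    for _ in range(max_iter):
        # Downhill direction: the negative gradient.
        d = [-g for g in grad(x)]
        # Stop when (numerically) no direction goes downhill anymore.
        if sum(di * di for di in d) ** 0.5 < tol:
            break
        # Step size: halve the trial step until F actually decreases.
        t = step
        while t > 1e-12 and F([xi + t * di for xi, di in zip(x, d)]) >= F(x):
            t *= 0.5
        x = [xi + t * di for xi, di in zip(x, d)]
    return x


# Illustrative bowl-shaped function with its minimum at (1, -0.5).
def F(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2


def grad_F(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]


x_min = descend(F, grad_F, [0.0, 0.0])
print(x_min)
```

The two ingredients, direction and step size, are exactly what the rest of the answer is about; LM just chooses them more cleverly than this loop does.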

Suppose that the function being minimized is F and we are at the point x(n) in our iteration. We wish to find the next iterate x(n+1) such that F(x(n+1)) < F(x(n)), i.e. the function value is smaller. In order to choose x(n+1) we need two things: a direction from x(n) and a step size (how far to go in that direction). The LM algorithm determines these values as follows:

First, compute a linear approximation to F at the point x(n). It is easy to find the downhill direction of a linear function, so we use the linear approximating function to determine the downhill direction. (Strictly speaking, LM is used for least-squares problems, where F is a sum of squared residuals, and it is the residuals that get linearized; the idea is the same.) Next, we need to know how far we can go in this chosen direction. If our approximating linear function is a good approximation to F over a large area around x(n), then we can take a fairly large step. If it is a good approximation only very close to x(n), then we can take only a very small step.

This is what LM does: it calculates a linear approximation to F at x(n), which gives the downhill direction, and then figures out how big a step to take based on how well the linear function approximates F near x(n). LM judges the quality of the approximation by taking a trial step in the chosen direction and comparing how much the linear approximation decreased with how much the actual function F decreased. If the two are close, the approximation is good and we can take a slightly larger step next time. If they are not close, the approximation is not good and we should back off and take a smaller step.
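
The predicted-versus-actual comparison described above can be sketched as follows, in plain Python. The sketch rests on several assumptions: F is a least-squares cost, 0.5 times the sum of squared residuals; the model being fitted is the made-up example y = a*exp(b*t) with two parameters; and the step size is controlled through a damping parameter `lam` (larger `lam` means a shorter, more cautious step). All function names and the update factors 3 and 2 are illustrative choices, not those of Minpack or any other particular library.

```python
import math


def residuals(p, ts, ys):
    a, b = p
    return [a * math.exp(b * t) - y for t, y in zip(ts, ys)]


def jacobian(p, ts):
    # Derivatives of each residual with respect to a and b.
    a, b = p
    return [(math.exp(b * t), a * t * math.exp(b * t)) for t in ts]


def cost(r):
    return 0.5 * sum(ri * ri for ri in r)


def lm_fit(p0, ts, ys, lam=1e-2, max_iter=100, tol=1e-10):
    p = list(p0)
    r = residuals(p, ts, ys)
    for _ in range(max_iter):
        J = jacobian(p, ts)
        # Gradient g = J^T r and the 2x2 matrix A = J^T J.
        g0 = sum(j[0] * ri for j, ri in zip(J, r))
        g1 = sum(j[1] * ri for j, ri in zip(J, r))
        a00 = sum(j[0] * j[0] for j in J)
        a01 = sum(j[0] * j[1] for j in J)
        a11 = sum(j[1] * j[1] for j in J)
        if math.hypot(g0, g1) < tol:
            break
        # Trial step d solves (A + lam*I) d = -g (Cramer's rule, 2x2).
        m00, m11 = a00 + lam, a11 + lam
        det = m00 * m11 - a01 * a01
        d0 = (-g0 * m11 + g1 * a01) / det
        d1 = (-g1 * m00 + g0 * a01) / det
        # Decrease predicted by the linearized residuals r + J d.
        lin = [ri + j[0] * d0 + j[1] * d1 for j, ri in zip(J, r)]
        predicted = cost(r) - cost(lin)
        if predicted <= 0.0:
            break
        # Decrease actually achieved by the true function.
        p_new = [p[0] + d0, p[1] + d1]
        r_new = residuals(p_new, ts, ys)
        rho = (cost(r) - cost(r_new)) / predicted
        if rho > 0.0:
            p, r = p_new, r_new      # the step went downhill: accept it
        if rho > 0.75:
            lam /= 3.0               # good agreement: allow larger steps
        elif rho < 0.25:
            lam *= 2.0               # poor agreement: back off
    return p


# Noise-free illustrative data generated from a = 2, b = -1.
ts = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(-t) for t in ts]
p_fit = lm_fit([1.0, 0.0], ts, ys)
print(p_fit)
```

Here `rho` is the ratio of actual to predicted decrease. Production codes refine almost every line of this loop (scaling of the parameters, a more careful damping update, solving the linear system without forming J^T J), which is where most of the subtlety lies.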


  • Try http://en.wikipedia.org/wiki/Levenberg–Marquardt_algorithm
  • PDF Tutorial from Ananth Ranganathan
  • JavaNumerics has a pretty readable implementation
  • The ICS has a C/C++ implementation

The basic ideas of the LM algorithm can be explained in a few pages, but a production-grade implementation that is fast and robust needs many subtle optimizations. The state of the art is still the Minpack implementation by Moré et al., documented in detail by Moré 1978 (http://link.springer.com/content/pdf/10.1007/BFb0067700.pdf) and in the Minpack user guide (http://www.mcs.anl.gov/~more/ANL8074b.pdf). To study the code, my C translation (https://jugit.fz-juelich.de/mlz/lmfit) is probably more accessible than the original Fortran code.