The Levenberg-Marquard Algorithm

The problem of least squares is widespread in applications amongst almost all natural sciences. The setting is a follows: We observe input data $\{x_i\}_{i=1,\ldots,n}$ and some output data $\{y_i\}_{i=1,\ldots,n}$ together with a model represented by a function $y=f(x,\theta)$ which depends on a set of parameters $\theta\in\Theta$ . The task now is to find the parameter $\hat\theta$ which minimizes

$J=\sum_{i=1}^n|y_i-f(x,\theta)|^2\, .$

The case where $f$ is linear has been addressed first by Legendre although Gauss claimed that he have been had a solution earlier. The optimal parameter $\hat\theta$ can be explicitly given in terms of the data.

In case that $f$ is non-linear, it usually is not possible to solve for $\hat\theta$ directly. The first key concept of the method is to linearize the non-linear problem through a Taylor expansion, i.e.

$f(x,\theta+h)=f(x,\theta)+\partial_\theta f(x,\theta)h\, .$

One can then solve the linearized problem recursively, until some condition is fulfilled. In each step, the Levenberg-Maquard algorithm "interpolates" between the gradient-descend method and the Gauss-Newton algorithm.