Least squares minimisation is a well-studied and widely used technique that originates with Gauss and Legendre. It consists of the following.
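Given $N$ pairs of data $(x_i, y_i)$ and a chosen function space $V$, one finds the function $f$ in $V$ with $f : \mathbb{R}^n \to \mathbb{R}^m$ which minimises
$$\sum_{i=1}^{N} |f(x_i) - y_i|^2$$
for $x_i \in \mathbb{R}^n$ and $y_i \in \mathbb{R}^m$. This is of particular interest when the values $y_i$ are observed with error, as one then does not try to fit the given data exactly.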
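By considering the error in observing $y_i$, we find ourselves within a probabilistic setting. Suppose the function $f$ can be described by choosing some parameters $a$, which are viewed as random. Since each $y_i$ is considered to be a Gaussian random sample drawn from a distribution centred at the true state of the system $f(x_i)$, the maximum likelihood estimate of $a$ coincides with the least squares minimiser, since the likelihood here is
$$L(a) \propto \prod_{i=1}^{N} \exp\left(-\frac{|y_i - f(x_i)|^2}{2\sigma^2}\right) = \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{N} |y_i - f(x_i)|^2\right),$$
where the observations are assumed independent with a common variance $\sigma^2$. Maximising $L$ over $a$ is therefore the same as minimising the sum of squares.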
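The equivalence between maximum likelihood and least squares can be checked numerically. The following is a minimal sketch, assuming a hypothetical one-parameter model $f(x; a) = a x$, independent Gaussian noise with a known standard deviation `sigma`, and a simple grid search over candidate parameters; none of these choices come from the text above.

```python
import numpy as np

# Hypothetical one-parameter model f(x; a) = a * x with Gaussian noise of known sigma.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
sigma = 0.1
y = 2.0 * x + sigma * rng.normal(size=x.size)

# Evaluate both criteria on a grid of candidate parameters.
candidates = np.linspace(0.0, 4.0, 401)
residuals = y[None, :] - candidates[:, None] * x[None, :]
sse = np.sum(residuals**2, axis=1)

# Negative log-likelihood for independent N(f(x_i; a), sigma^2) observations.
neg_log_lik = sse / (2.0 * sigma**2) + x.size * np.log(sigma * np.sqrt(2.0 * np.pi))

a_least_squares = candidates[np.argmin(sse)]
a_max_likelihood = candidates[np.argmin(neg_log_lik)]
print(a_least_squares, a_max_likelihood)
```

Because the negative log-likelihood is an increasing affine function of the sum of squares, both criteria select the same candidate.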
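The minimisation problem first defined, of finding a function $f : \mathbb{R}^n \to \mathbb{R}^m$, is actually $m$ minimisation problems of finding $f_j : \mathbb{R}^n \to \mathbb{R}$, where $f_j$ are the components of $f$. To implement this on a computer, one must parameterise a projection of the function space onto each of its components by finitely many parameters. In other words, a basis $\{\phi_k\}$ of $V$ is chosen and then projected onto each of the coordinate axes of $\mathbb{R}^m$. Then, for each $f$ in $V$ and $j = 1, \dots, m$, one can write the functions by truncating its expansion:
$$f_j(x) \approx \sum_{k=1}^{K} a_k^j \phi_k^j(x),$$
where $f_j$ is the projection of $f$ onto the $j$th coordinate and $\phi_k^j$ is the $j$th component of $\phi_k$. For the sake of brevity, we will suppress the index $j$ and treat $f$ as a function from $\mathbb{R}^n$ to $\mathbb{R}$. A minimiser of the problem, if one exists, must have differential zero in $a$, namely
$$\frac{\partial}{\partial a_l} \sum_{i=1}^{N} \Big( \sum_{k=1}^{K} a_k \phi_k(x_i) - y_i \Big)^2 = 2 \sum_{i=1}^{N} \phi_l(x_i) \Big( \sum_{k=1}^{K} a_k \phi_k(x_i) - y_i \Big) = 0, \qquad l = 1, \dots, K,$$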
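and one can write this, using the matrix $A \in \mathbb{R}^{N \times K}$ with entries $A_{ik} = \phi_k(x_i)$, the coefficient vector $a = (a_1, \dots, a_K)^T$ and the data vector $y = (y_1, \dots, y_N)^T$, as the normal equations
$$A^T A a = A^T y.$$
The solution is characterised using the Gauss-Markov theorem.
Suppose that $A^T A$ is invertible. Then among all unbiased linear estimators $\hat{a}$ such that $\hat{a} = B y$ for some matrix $B$, the one that minimises the least squares error is
$$\hat{a} = (A^T A)^{-1} A^T y.$$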
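The normal equations can be solved directly in a few lines. The sketch below assumes a design matrix $A$ with entries $A_{ik} = \phi_k(x_i)$ and, purely for illustration, a hypothetical monomial basis $\phi_k(x) = x^k$ in one variable. The function names `design_matrix` and `fit_normal_equations` are invented for this example.

```python
import numpy as np

def design_matrix(x, K):
    """Return A with A[i, k] = phi_k(x_i) for the monomial basis phi_k(x) = x**k."""
    return np.vander(x, K, increasing=True)

def fit_normal_equations(x, y, K):
    """Solve A^T A a = A^T y without forming the inverse explicitly."""
    A = design_matrix(x, K)
    return np.linalg.solve(A.T @ A, A.T @ y)

# Noise-free data from a known quadratic, so the coefficients should be recovered.
x = np.linspace(0.0, 1.0, 20)
y = 1.0 + 2.0 * x - 3.0 * x**2

a = fit_normal_equations(x, y, K=3)

# Cross-check against NumPy's least squares routine.
a_lstsq = np.linalg.lstsq(design_matrix(x, 3), y, rcond=None)[0]
print(a, a_lstsq)
```

For badly conditioned bases, a routine such as `np.linalg.lstsq` is usually the safer choice, since forming $A^T A$ squares the condition number.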
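Furthermore, if a constraint is physically relevant to the situation, then a constrained least squares minimisation will be more suitable. While no numerical results on this are given later, it is still a useful consideration. The problem is of the form
$$\min_{a} |A a - y|^2 \quad \text{subject to} \quad |a|^2 \le c$$
for some $c > 0$, and can be considered equivalent to
$$\min_{a} |A a - y|^2 + \lambda |a|^2$$
for a suitable $\lambda \ge 0$. This equivalence can be seen by considering the minimisation problem
$$\min_{a} |A a - y|^2 + \lambda \left( |a|^2 - c \right),$$
in which $\lambda$ acts as a Lagrange multiplier for the constraint.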
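This is the situation of Tikhonov regularisation:
Let $A : H_1 \to H_2$ be a linear operator between Hilbert spaces such that the range $R(A)$ is a closed subspace of $H_2$. Let $Q : H_1 \to H_1$ be self-adjoint and positive definite, and let $b \in H_2$ and $x_0 \in H_1$ be given as well. Then
$$\hat{x} = \arg\min_{x \in H_1} \|A x - b\|^2 + \|x - x_0\|_Q^2 \quad \iff \quad (A^* A + Q) \hat{x} = A^* b + Q x_0,$$
where $\|x\|_Q^2 = \langle x, Q x \rangle$.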
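In finite dimensions the Tikhonov problem reduces to a single linear solve. The sketch below assumes the standard regularised normal equations $(A^T A + Q) x = A^T b + Q x_0$ for real matrices, with the hypothetical choices $Q = \lambda I$ and $x_0 = 0$ and randomly generated test data. The function name `tikhonov_solve` is invented for this example.

```python
import numpy as np

def tikhonov_solve(A, b, Q, x0):
    """Solve (A^T A + Q) x = A^T b + Q x0 for real matrices."""
    return np.linalg.solve(A.T @ A + Q, A.T @ b + Q @ x0)

# Randomly generated test problem (illustrative data only).
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 4))
b = rng.normal(size=30)
Q = 0.5 * np.eye(4)   # Q = lambda * I with lambda = 0.5
x0 = np.zeros(4)

x_hat = tikhonov_solve(A, b, Q, x0)

# The gradient of |Ax - b|^2 + (x - x0)^T Q (x - x0) should vanish at x_hat.
grad = 2.0 * A.T @ (A @ x_hat - b) + 2.0 * Q @ (x_hat - x0)

# Unregularised solution, for comparison of norms.
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(x_hat), np.linalg.norm(x_ls))
```

With $x_0 = 0$ the regularised solution has a smaller norm than the unregularised one, which is the sense in which the penalty plays the role of a norm constraint.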
Choice of Minimisation Space
An important decision to make in the minimisation is the choice of space where one minimises. One can choose a basis of this space which has local support, for example
This has the advantage that the matrix $A$ as given above will be sparse, so computation is faster. However, if the data are clustered in a small subset of the domain under consideration, then basis functions whose support contains no data points are left undetermined, and the minimisation becomes ill posed.
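Both effects of a locally supported basis can be seen in a small experiment. The sketch below assumes piecewise-linear hat functions on a uniform grid over $[0, 1]$ as a hypothetical example of such a basis, and compares data spread over the whole interval with data clustered in $[0, 0.25]$. The function name `hat_basis_matrix` is invented for this example.

```python
import numpy as np

def hat_basis_matrix(x, nodes):
    """Return A with A[i, k] = phi_k(x_i) for hat functions on uniform nodes."""
    h = nodes[1] - nodes[0]
    return np.clip(1.0 - np.abs(x[:, None] - nodes[None, :]) / h, 0.0, None)

nodes = np.linspace(0.0, 1.0, 11)         # 11 hat functions on [0, 1]
x_spread = np.linspace(0.0, 1.0, 200)     # data covering the whole interval
x_cluster = np.linspace(0.0, 0.25, 200)   # data clustered in [0, 0.25]

A_spread = hat_basis_matrix(x_spread, nodes)
A_cluster = hat_basis_matrix(x_cluster, nodes)

# Fraction of zero entries: each data point touches at most two hat functions.
sparsity = np.mean(A_spread == 0.0)

# Columns with no data in their support are identically zero, so A loses rank.
empty_columns = int(np.sum(~A_cluster.any(axis=0)))
rank_spread = np.linalg.matrix_rank(A_spread)
rank_cluster = np.linalg.matrix_rank(A_cluster)
print(sparsity, empty_columns, rank_spread, rank_cluster)
```

When the rank of $A$ falls below the number of basis functions, $A^T A$ is singular and the least squares coefficients are no longer unique.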
An alternative type of basis is one which has global support, for example
This is particularly useful if one wants to approximate the function from data clustered in a small subset of the space and to extrapolate outside of this region. However, it does require more computation, as the matrix $A$ will in general be dense, with every entry non-zero.
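The contrast with a globally supported basis can be illustrated in the same way. The sketch below assumes a hypothetical cosine basis $\phi_k(x) = \cos(k \pi x)$ on $[0, 1]$ and data clustered in $[0, 0.25]$; it only inspects the structure of the matrix and makes no claim about extrapolation quality. The function name `cosine_basis_matrix` is invented for this example.

```python
import numpy as np

def cosine_basis_matrix(x, K):
    """Return A with A[i, k] = cos(k * pi * x_i), a globally supported basis."""
    k = np.arange(K)
    return np.cos(np.pi * x[:, None] * k[None, :])

x_cluster = np.linspace(0.0, 0.25, 200)   # data clustered in [0, 0.25]
A = cosine_basis_matrix(x_cluster, K=5)

# Every basis function is non-zero almost everywhere, so A is dense.
dense_fraction = np.mean(A != 0.0)

# All coefficients are still constrained by the clustered data.
rank = np.linalg.matrix_rank(A)
print(dense_fraction, rank)
```

Full rank does not guarantee good conditioning: restricting a global basis to a small region typically makes its columns nearly dependent, which is worth checking with `np.linalg.cond` before trusting an extrapolation.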