# Least Squares Approximation

The method of least squares minimisation is a well studied and well used one. It is a technique that originates with Gauss and Legendre. It consists of the following method.

Given N pairs of data and a chosen function space , one finds the function in with which minimises:

for . This is of particular interest and of use when one has error with the values , as then one does not try to perfectly fit the information given.

By considering the error in observing , we find ourselves within a probabilistic setting. Suppose the function can be described by choosing some parameters which are viewed as random, then since is considered to be a Gaussian random sample drawn from a distribution centred at the true state of the system , we have that the maximum likelihood estimation of the coincides with the minimum least squares distance, since the likelihood here is

The minimisation problem first defined of finding a function is actually minimisation problems of finding where are the components of . To implement on a computer, one must parameterise a projection of the function space onto each of its components by finitely many parameters. In other words, a basis of is chosen and then projected onto each of the coordinate axes of . Then, for each in and , one can write the functions by truncating its expansion:

where is clearly the projection of onto the coordinate. For the sake of brevity, we will suppress the index and treat as a function from to . A minimiser, if one exists, of the problem must have differential zero in , namely

and one can write this, using

as

using the Gauss Markov theorem.

**Gauss-Markov Theorem**

Suppose that is invertible. Then among all unbiased linear estimators such that the one that minimises the least squares error is

### Constrained Minimisation

Furthermore, if it is physically relevant to consider some constraint in the situation, then a constrained least squares minimisation will be more suitable. While no results are given numerically later on this, it is still a useful consideration to make. This is of the form

for some . and can be considered equivalent to

This equivalence can be seen by considering the minimisation problem

This is the situation of Tikhonov regularisation:

**Tikhonov Regularisation**

Let be a linear operator between Hilbert Spaces such that is a closed subspace of . Let be self adjoint and positive definite, and and be given as well. Then

### Choice of Minimisation Space

An important decision to make in the minimisation is the choice of space where one minimises. One can choose a basis of this space which has local support, for example

This has advantages that the matrix as given above will be sparse, so computation is faster, although if the data that one is using is clustered in a small subset of the space one is considering then this will lead to ill posedness of the minimisation.

An alternative type of basis is one which has global support, for example

This is particularly useful if one wants to approximate the function from data clustered in a small subset of the space and extrapolate information outside of this region. However it does require more computation as in general the matrix will have all non zero entries.