Least Squares Linear Regression

Publication: Draper/Smith/73a: Applied Regression Analysis
Publication: Press/92a: Numerical Recipes in C
Name: Least Squares Linear Regression
Description:

Global parametric methods try to fit a single functional model to all the given training data. This imposes a strong assumption on the form of the unknown regression function, which may lead to lower accuracy if the assumption does not hold. However, these approaches usually fit simple functional models that are easy to interpret and have fast computational solutions.

A classical example of a global approach is the widely used linear regression model, usually obtained with a least squares error criterion. Fitting a global linear polynomial under this criterion consists of finding the vector of parameters β that minimises the sum of squared errors, i.e. (Y − Xβ)’(Y − Xβ), where X’ denotes the transpose of matrix X. After some matrix algebra, the minimisation of this expression with respect to β leads to the following set of equations, usually referred to as the normal equations (e.g. Draper & Smith, 1981),

(X’X)β = X’Y

The parameter values can be obtained by solving this system,

β = (X’X)^-1 X’Y

where Z^-1 denotes the inverse of matrix Z.
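As a minimal sketch of the computation above, using NumPy and hypothetical data (a column of ones for the intercept plus a single predictor), the normal equations (X’X)β = X’Y can be solved directly, without forming the inverse explicitly:

```python
import numpy as np

# Hypothetical training data: column of ones for the intercept, one predictor.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
Y = np.array([1.0, 3.0, 5.0, 7.0])  # generated by y = 1 + 2x

# Normal equations: (X'X) beta = X'Y, solved as a linear system
# rather than by computing the matrix inverse.
beta = np.linalg.solve(X.T @ X, X.T @ Y)
# beta is approximately [1.0, 2.0], the intercept and slope.
```

Solving the linear system is preferable to computing (X’X)^-1 even when the inverse exists, as it is cheaper and less error-prone.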

As the inverse of X’X does not always exist (and may be ill-conditioned even when it does), this process suffers from numerical instability. A better alternative (Press et al., 1992) is to use a technique known as Singular Value Decomposition (SVD), which can be used to find solutions of systems of equations of the form Xβ = Y.
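The SVD route can be sketched as follows, on hypothetical data whose design matrix is deliberately rank-deficient (two identical columns), so that X’X is singular and the normal-equations inverse does not exist:

```python
import numpy as np

# Hypothetical rank-deficient data: the second and third columns of X
# are identical, so X'X is singular and cannot be inverted.
X = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 2.0],
              [1.0, 3.0, 3.0],
              [1.0, 4.0, 4.0]])
Y = np.array([3.0, 5.0, 7.0, 9.0])

# SVD: X = U diag(s) V'.  Singular values below a tolerance are treated
# as zero before "inverting"; this is what makes the method stable.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
tol = max(X.shape) * np.finfo(float).eps * s[0]
s_inv = np.where(s > tol, 1.0 / s, 0.0)

# Pseudoinverse solution: beta = V diag(s_inv) U' Y.
beta = Vt.T @ (s_inv * (U.T @ Y))

# The fit reproduces Y even though (X'X)^-1 does not exist.
```

Among all least squares solutions of this degenerate system, the SVD yields the one of minimum norm; `numpy.linalg.lstsq` wraps essentially the same procedure.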

There are many variants of this general set-up that differ in the way they develop/tune these models to the data (e.g. Draper & Smith, 1981).

Method Type: Method