Non-parametric Regression (also known as local modelling)

Publication Aha/etal/91a: Instance-based learning algorithms
Fan/95a: Local Modelling
Loader/Cleveland/95b: Smoothing by Local Regression: Principles and Methods (with discussion)
Nadaraya/64a: On estimating regression
Parzen/62a: On estimation of a probability density function and mode
Rosenblatt/56a: Remarks on some nonparametric estimates of a density function
Watson/64a: Smooth Regression Analysis
Name: Non-parametric Regression (also known as local modelling)
Description

Non-parametric regression¹ belongs to a data-analytic methodology usually known as local modelling (Fan, 1995). The basic idea behind local regression is to obtain the prediction for a data point x by fitting a parametric function in the neighbourhood of x. These methods are therefore “locally parametric”, as opposed to the methods described in the previous section.

According to Cleveland and Loader (1995), local regression traces back to the 19th century; these authors provide a historical survey of the work done since then. The modern work on local modelling starts in the 1950s with the kernel methods introduced within the probability density estimation setting (Rosenblatt, 1956; Parzen, 1962) and within the regression setting (Nadaraya, 1964; Watson, 1964). Local polynomial regression is a generalisation of this early work on kernel regression: in effect, kernel regression amounts to fitting a polynomial of degree zero (a constant) in a neighbourhood. In summary, the general goal of local regression is to fit a polynomial of degree p around a query point (or test case) xq, using the training data in its neighbourhood. This formulation covers the various available settings, such as kernel regression (p = 0), local linear regression (p = 1), etc.²
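The local polynomial fit described above can be sketched as a weighted least-squares problem solved around each query point. The sketch below assumes a Gaussian kernel, one-dimensional inputs, and a fixed bandwidth h; the function name and these choices are illustrative, not part of the original text.

```python
import numpy as np

def local_poly_predict(xq, x, y, p=1, h=1.0):
    """Predict at query point xq by fitting a degree-p polynomial to the
    training data (x, y), weighted by a Gaussian kernel of bandwidth h.
    p=0 recovers kernel (Nadaraya-Watson) regression; p=1 is local linear."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    # kernel weights: training points near xq dominate the fit
    w = np.exp(-0.5 * ((x - xq) / h) ** 2)
    # design matrix in powers of (x - xq): columns [1, (x-xq), ..., (x-xq)^p]
    X = np.vander(x - xq, N=p + 1, increasing=True)
    W = np.diag(w)
    # weighted least squares: beta = (X' W X)^{-1} X' W y
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    # because the polynomial is centred at xq, the intercept is the prediction
    return beta[0]
```

Centring the polynomial at xq means the fitted value at the query point is simply the estimated intercept, which is why kernel regression (p = 0) reduces to a kernel-weighted average of the responses.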

Local regression is strongly related to the work on instance-based learning (e.g. Aha et al., 1991) within the machine learning community. Given a case x for which we want to obtain a prediction, these methods use the training samples that are “most similar” to x to build a local model, which is then used to obtain the prediction. This type of inductive methodology does not perform any kind of generalisation of the given data³ and “delays learning” until prediction time⁴.
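The lazy, instance-based scheme described above can be illustrated with a minimal nearest-neighbour regressor: "fitting" merely stores the data, and all work is deferred to prediction time. The class name, the choice of k-nearest-neighbour averaging, and the 1-D distance are illustrative assumptions, not a method from the cited papers.

```python
import numpy as np

class LazyKNNRegressor:
    """A minimal instance-based (lazy) learner: training just stores the
    cases; a local model is only built when a prediction is requested."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # no generalisation happens here, the data are simply memorised
        self.X = np.asarray(X, float)
        self.y = np.asarray(y, float)
        return self

    def predict(self, xq):
        # "learning" happens now: find the k most similar stored cases...
        dist = np.abs(self.X - xq)
        nn = np.argsort(dist)[:self.k]
        # ...and use them to form a local model (here, their average)
        return self.y[nn].mean()
```

Replacing the average with the weighted polynomial fit shown earlier turns this lazy learner into a local regression method, which is the connection the paragraph above draws.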


  1. Also known as non-parametric smoothing and local regression.
  2. A further generalisation of this set-up consists of using polynomial mixing (Cleveland & Loader, 1995), where p can take non-integer values.
  3. This is not completely true for some types of instance-based learners as we will see later. Nevertheless, it is true in the case of local modelling.
  4. That is the reason for also being known as lazy learners (Aha, 1997).

Method Type: Method