Non-parametric Regression (also known as local modelling)

Publication Aha/etal/91a: Instance-based learning algorithms
Fan/95a: Local Modelling
Loader/Cleveland/95b: Smoothing by Local Regression: Principles and Methods (with discussion)
Nadaraya/64a: On estimating regression
Parzen/62a: On estimation of a probability density function and mode
Rosenblatt/56a: Remarks on some nonparametric estimates of a density function
Watson/64a: Smooth Regression Analysis
Name: Non-parametric Regression (also known as local modelling)
Description

Non-parametric regression¹ belongs to a data-analytic methodology usually known as local modelling (Fan, 1995). The basic idea behind local regression is to obtain the prediction for a data point x by fitting a parametric function in the neighbourhood of x. These methods are therefore “locally parametric”, as opposed to the methods described in the previous section.

According to Cleveland and Loader (1995), local regression traces back to the 19th century; these authors provide a historical survey of the work done since then. The modern work on local modelling starts in the 1950s with the kernel methods introduced within the probability density estimation setting (Rosenblatt, 1956; Parzen, 1962) and within the regression setting (Nadaraya, 1964; Watson, 1964). Local polynomial regression is a generalisation of this early work on kernel regression: in effect, kernel regression amounts to fitting a polynomial of degree zero (a constant) in a neighbourhood. In summary, the general goal of local regression is to fit a polynomial of degree p around a query point (or test case) xq, using the training data in its neighbourhood. This formulation covers the various available settings, such as kernel regression (p = 0), local linear regression (p = 1), etc.²
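The local polynomial fit described above can be sketched as a weighted least-squares problem solved around each query point. The sketch below assumes a Gaussian kernel, one-dimensional inputs, and a fixed bandwidth h; the function name and these choices are illustrative, not part of the original text.

```python
import numpy as np

def local_poly_predict(xq, x, y, p=1, h=1.0):
    """Predict at query point xq by fitting a degree-p polynomial to the
    training data (x, y), weighted by a Gaussian kernel of bandwidth h.
    p=0 recovers kernel (Nadaraya-Watson) regression; p=1 is local linear."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    # kernel weights: training points near xq dominate the fit
    w = np.exp(-0.5 * ((x - xq) / h) ** 2)
    # design matrix in powers of (x - xq): columns [1, (x-xq), ..., (x-xq)^p]
    X = np.vander(x - xq, N=p + 1, increasing=True)
    W = np.diag(w)
    # weighted least squares: beta = (X' W X)^{-1} X' W y
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    # because the polynomial is centred at xq, the intercept is the prediction
    return beta[0]
```

Centring the polynomial at xq means the fitted value at the query point is simply the estimated intercept, which is why kernel regression (p = 0) reduces to a kernel-weighted average of the responses.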

Local regression is strongly related to the work on instance-based learning (e.g. Aha et al., 1991) within the machine learning community. Given a case x for which we want to obtain a prediction, these methods use the training samples that are “most similar” to x to build a local model, which is then used to obtain the prediction. This type of inductive methodology does not perform any kind of generalisation of the given data³ and “delays learning” until prediction time⁴.
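The lazy, instance-based scheme described above can be illustrated with a minimal nearest-neighbour regressor: "fitting" merely stores the data, and all work is deferred to prediction time. The class name, the choice of k-nearest-neighbour averaging, and the 1-D distance are illustrative assumptions, not a method from the cited papers.

```python
import numpy as np

class LazyKNNRegressor:
    """A minimal instance-based (lazy) learner: training just stores the
    cases; a local model is only built when a prediction is requested."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # no generalisation happens here, the data are simply memorised
        self.X = np.asarray(X, float)
        self.y = np.asarray(y, float)
        return self

    def predict(self, xq):
        # "learning" happens now: find the k most similar stored cases...
        dist = np.abs(self.X - xq)
        nn = np.argsort(dist)[:self.k]
        # ...and use them to form a local model (here, their average)
        return self.y[nn].mean()
```

Replacing the average with the weighted polynomial fit shown earlier turns this lazy learner into a local regression method, which is the connection the paragraph above draws.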


  1. Also known as non-parametric smoothing and local regression.
  2. A further generalisation of this set-up consists of using polynomial mixing (Cleveland & Loader, 1995), where p can take non-integer values.
  3. This is not completely true for some types of instance-based learners as we will see later. Nevertheless, it is true in the case of local modelling.
  4. That is the reason for also being known as lazy learners (Aha, 1997).

Method Type: Method