|
Function Approximation
Publication |
Draper/Smith/73a: Applied Regression Analysis
|
Name |
Function Approximation |
Description |
Many general learning tasks, especially concept learning, may be regarded as function
approximation.
Given are examples of the function to be approximated.
The aim is to find a hypothesis (a function as well), that can be used for predicting the function
values of yet unseen instances, e.g. to predict future events.
Good hypotheses are those, often predicting an accurate function value.
The quality of a hypothesis for approximating a specific function is measured by a so
called loss function, increasing as the differences between predicted and true
function values increase. It is also taken into account, that it is more important to predict well
on frequently observed instances.
|
Definition |
Given
-
a target function t: X → ℝ, mapping
instances of a domain X to real values.
-
a probability distribution P over the domain
X, indicating the probability each instance x ∈
X is drawn.
-
an oracle, producing a finite set E of
examples, where E ⊂ X x ℝ.
Each example e ∈ E is generated like this:
-
An instance x ∈ X is drawn with respect
to P.
-
The value of y is chosen as t(x), the target
function value of x.
At this point noisy examples might be modelled by
allowing y ≠ t(x), e.g. y ≈ t(x), where
y deviates randomly from t(x).
-
Output e := (x, y).
-
a language LH for hypotheses.Each single
hypothesis h is a mapping h: X → ℝ.
Then the aim of function approximation is:
Find a hypothesis h ∈ LH minimizing the expected
error (or expected risk), defined as
|E| |
|
∑ |
P(xi) * Q(xi, h) ,
|
i=1 |
|
where xi denotes the instance of example number
i, and P(xi) the probability for xi
to be drawn.
So it is more important to make less mistakes on more
probable examples than on less probable ones.
Q(xi, h) is a loss function, describing
the quality of the prediction of hypothesis h on example
xi.
There are two subtasks, depending on the structure of
Q:
-
Classification: The examples are
divided into a given set of classes.
Usually the following function is applied:
Q(x, h) := |
1 if h(x) ≠ t(x) |
0 if h(x) = t(x) |
Note that, applying this measure, we have reached
the PAC-version of the task of concept learning!
-
Regression: A real value function
shall be approximated.
Q is often chosen as the squared difference
between the predicted value h(x) and the
correct value t(x):
Q(x, h) := (t(x) - h(x))2
|
Methods |
Lazy Learning
Neural Nets
Support Vector Machine (SVM)
Top-Down Induction of Regression Trees
|
Theories |
Probably Approximately Correct (PAC-) Learning
Statistical Learning
|
|
|