Version Spaces may be regarded as a method or as a hypothesis language itself.
In concept learning one has
- a set of classified examples and
- a set of hypotheses, which define the possible results of the algorithm.
The task of incremental concept learning requires that the learning algorithm
read one example per iteration and output a hypothesis each time. It is usually
assumed that the algorithm may not store many of the seen examples, but essentially
just revises the current hypothesis according to the current example.
Often several different hypotheses fit the data, and it can be advantageous,
for reasons of efficiency, to define a preference criterion so that there is
a unique hypothesis at each point in time.
In fact, however, there is often a whole set of possible results fitting the
data, and we expect this set to shrink as the algorithm reads more examples.
By using Version Spaces, one can avoid choosing a preference criterion, which
may not be justified in some situations, and instead represent the set of all
hypotheses of an underlying hypothesis language that (still) fit the data.
After each iteration step, an arbitrary element of the represented set of
hypotheses can be chosen as the output of the learning algorithm.
The set can be represented compactly by partially ordering the hypothesis
language by generality and storing just its most general and most specific
hypotheses, i.e. the G and S boundaries of the version space.
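This boundary representation can be sketched as a small candidate-elimination loop. The helper names and the hypothesis language used here (conjunctions of attribute values with '?' as a wildcard) are illustrative assumptions, not fixed by the text above; a full implementation would additionally prune non-maximal elements of G and non-minimal elements of S.

```python
# A minimal candidate-elimination sketch: maintain the most specific (S)
# and most general (G) hypotheses consistent with the examples seen so far.
# Hypothesis language (an assumption for this sketch): tuples of attribute
# values, where '?' matches any value and None is the empty hypothesis.

WILDCARD = '?'

def matches(h, x):
    """True if hypothesis h covers example x."""
    return all(a == WILDCARD or a == v for a, v in zip(h, x))

def more_general(g, s):
    """True if g is at least as general as s (None = empty hypothesis)."""
    if s is None:
        return True
    return all(a == WILDCARD or a == b for a, b in zip(g, s))

def generalize(s, x):
    """Minimally generalize s so that it covers the positive example x."""
    if s is None:                          # empty hypothesis covers nothing
        return tuple(x)
    return tuple(a if a == v else WILDCARD for a, v in zip(s, x))

def specializations(g, x, domains):
    """Minimal specializations of g that exclude the negative example x."""
    for i, a in enumerate(g):
        if a == WILDCARD:
            for v in domains[i]:
                if v != x[i]:
                    yield g[:i] + (v,) + g[i + 1:]

def candidate_elimination(examples, domains):
    """Revise the S and G boundaries once per example, as in the text."""
    S = [None]                                    # most specific boundary
    G = [tuple(WILDCARD for _ in domains)]        # most general boundary
    for x, positive in examples:
        if positive:
            G = [g for g in G if matches(g, x)]   # drop g that reject x
            S = [generalize(s, x) for s in S]     # lift S to cover x
        else:
            S = [s for s in S if s is None or not matches(s, x)]
            G = [h
                 for g in G
                 for h in (specializations(g, x, domains)
                           if matches(g, x) else [g])
                 if any(more_general(h, s) for s in S)]
    return S, G

# Tiny demo over two attributes (colour and size):
domains = [['red', 'blue'], ['small', 'large']]
examples = [(('red', 'small'), True),
            (('red', 'large'), True),
            (('blue', 'small'), False)]
S, G = candidate_elimination(examples, domains)
# Here both boundaries converge to the single hypothesis ('red', '?').
```

After each example, any hypothesis lying between S and G in the generality order is consistent with the data, so an arbitrary one can be output; when S and G meet, the version space has collapsed to a single concept.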