Description
The task of clustering is to structure a given set of unclassified instances of an example language by creating
concepts based on similarities found in the training data.
The main difference from supervised learning is that there is neither a target predicate nor an oracle
dividing the instances of the training set into categories. The categories are formed by the learner itself.
Given: A set of (unclassified) instances of an example language LE.
Find: A set of concepts that covers all given examples, such that
- the similarity between examples of the same concept is maximized,
- the similarity between examples of different concepts is minimized.
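The two criteria above can be made concrete with a small sketch. It assumes that instances are tuples of nominal attribute values and that similarity is the fraction of attributes on which two instances agree; both are illustrative choices rather than part of the task definition, and the names `similarity` and `partition_quality` are hypothetical.

```python
from itertools import combinations

def similarity(a, b):
    """Fraction of attribute positions on which two instances agree (assumed measure)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def mean(values):
    """Arithmetic mean, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0

def partition_quality(clusters):
    """Return (mean within-cluster similarity, mean between-cluster similarity)."""
    within = [similarity(a, b)
              for cluster in clusters
              for a, b in combinations(cluster, 2)]
    between = [similarity(a, b)
               for c1, c2 in combinations(clusters, 2)
               for a in c1 for b in c2]
    return mean(within), mean(between)

# Made-up example instances: (colour, shape, size).
instances = [("red", "round", "small"), ("red", "round", "large"),
             ("blue", "square", "large"), ("blue", "square", "small")]

good = [instances[:2], instances[2:]]                               # groups by colour and shape
bad = [[instances[0], instances[2]], [instances[1], instances[3]]]  # groups by size only

good_within, good_between = partition_quality(good)
bad_within, bad_between = partition_quality(bad)
```

Under these assumptions, a clustering that satisfies the criteria better has a higher first value and a lower second value, as is the case for `good` compared with `bad`.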
Conceptual Clustering
The setting above just aims at finding subsets of similar examples.
Conceptual Clustering extends this task to finding intensional
descriptions of these subsets. This can be seen as a second learning
step, although it will not necessarily be split from the first one:
- Partition the example set, optimizing a measure based on
similarity of the instances within the same subsets.
- Perform concept learning (supervised learning) for each of the
found subsets, to turn the extensional description into an intensional
one.
Note that the second step allows for prediction of yet unseen
instances of the example language.
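The two steps and the prediction of unseen instances can be sketched as follows. This is a minimal illustration under stated assumptions, not the Star method or COBWEB: step one uses a simple greedy "leader" partitioning with an assumed similarity threshold, and step two describes each subset by the conjunction of attribute values shared by all of its members. All function names are hypothetical.

```python
def similarity(a, b):
    """Fraction of attribute positions on which two instances agree (assumed measure)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def partition(instances, threshold=0.5):
    """Step 1: greedy leader clustering, a stand-in for any similarity-based partitioner."""
    clusters = []
    for inst in instances:
        for cluster in clusters:
            # Compare against the first member (the "leader") of each cluster.
            if similarity(inst, cluster[0]) >= threshold:
                cluster.append(inst)
                break
        else:
            clusters.append([inst])  # no sufficiently similar leader: open a new cluster
    return clusters

def describe(cluster):
    """Step 2: intensional description as the attribute values shared by all members."""
    first = cluster[0]
    return {i: v for i, v in enumerate(first)
            if all(member[i] == v for member in cluster)}

def covers(description, instance):
    """True if the instance satisfies every attribute test in the description."""
    return all(instance[i] == v for i, v in description.items())

def predict(descriptions, instance):
    """Return the index of the first concept covering the instance, or None."""
    for index, description in enumerate(descriptions):
        if covers(description, instance):
            return index
    return None

# Made-up example instances: (colour, shape, size).
instances = [("red", "round", "small"), ("red", "round", "large"),
             ("blue", "square", "large"), ("blue", "square", "small")]

clusters = partition(instances)
descriptions = [describe(c) for c in clusters]
unseen = ("red", "round", "medium")
label = predict(descriptions, unseen)
```

The extensional description of a concept is the list of its members in `clusters`; the intensional description in `descriptions` is what allows `predict` to classify an instance that was not part of the example set.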
One method addressing the task of Conceptual Clustering is the
Star method. COBWEB is an example of a clustering
algorithm that does not induce an intensional description of
the found clusters but organizes them in a tree structure.