COBWEB

Publication Fisher/87a: Knowledge Acquisition Via Incremental Conceptual Clustering
Name COBWEB
Description COBWEB is an incremental clustering algorithm, based on probabilistic categorization trees. The search for a good clustering is guided by a qualitity measure for partitions of data.

The algorithm reads unclassified examples, given in attribute-value representation. Pure COBWEB only supports nominal attributes, later extensions incorporate numerical values as well. An incremental clustering algorithm reads one example per iteration, modifying the actual result each time.

The aim is to achieve
  • high predictability of variable values, given a cluster.
  • high predictiveness of a cluster, given variable values.

COBWEB´s makes use of the category (cluster) utility and partition utility measures. Please follow the link above for details on these measures.

For each new example e COBWEB compares the following alternatives for the actual node (starting with the root node), as long as it is no leaf. The alternative with highest partition utility is chosen.
  1. Insert e into the best successor node. Therefore e is inserted into every successor testwise, and the one where highest partition utility has been observed is chosen.
  2. Create a new leaf for e and make it a successor of the actual node.
  3. Generate a new node n, which is predecessor of the two best successors of the actual node (found in step 1) and insert n between the actual node and the two succesors. Example e is inserted into n.
  4. Choose the best successor node (found in step 1), delete it and make its successors direct successors of the actual node. Afterwards example e is inserted as in 1).
If a leaf has been reached, COBWEB creates two successors, one containing the new example, and one, containing the example of the old leaf. The old leaf becomes an inner node.
Example Languages Attribute-Value
Dm Step Clustering (Unsupervised Learning)
Method Type Algorithm
Measures Category and Partition Utility
Demos Cobweb Demo Applet