Category and Partition Utility

Publication Fisher/87a: Knowledge Acquisition Via Incremental Conceptual Clustering

Name Category and Partition Utility

Description
In order to achieve

high predictability of variable values, given a cluster, and

high predictiveness of a cluster, given variable values,

the clustering algorithm COBWEB measures the utility of a cluster (Category Utility) as:
CU(C_k) = P(C_k) * ∑_i∑_j [P(A_i = V_ij|C_k)² - P(A_i = V_ij)²]
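
The formula can be evaluated directly from counts when all variables are nominal. The following is a minimal sketch, not COBWEB's own implementation: it assumes each observation is a tuple of nominal values and that probabilities are estimated as relative frequencies. The names value_probabilities and category_utility are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def value_probabilities(rows):
    """Return {(attribute_index, value): relative frequency} for a list of rows."""
    counts = Counter((i, v) for row in rows for i, v in enumerate(row))
    return {key: c / len(rows) for key, c in counts.items()}


def category_utility(cluster, data):
    """CU(C_k) = P(C_k) * sum_i sum_j [P(A_i=V_ij | C_k)^2 - P(A_i=V_ij)^2].

    Assumption: `cluster` is a non-empty subset of `data`, and probabilities
    are estimated as relative frequencies.
    """
    p_cluster = len(cluster) / len(data)
    # Values that never occur in the cluster have conditional probability 0,
    # so summing only over the observed values is sufficient.
    within = sum(p ** 2 for p in value_probabilities(cluster).values())
    overall = sum(p ** 2 for p in value_probabilities(data).values())
    return p_cluster * (within - overall)


# Toy data with two attributes (color, size).
data = [("red", "small"), ("red", "large"), ("blue", "small"), ("blue", "large")]
red_cluster = data[:2]
print(category_utility(red_cluster, data))  # 0.25
```

In this toy example the cluster of red observations makes color perfectly predictable (1.0 instead of 0.5 squared-probability mass) while size stays as unpredictable as in the whole data set, which gives 0.5 * (1.5 - 1.0) = 0.25.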

The utility of a partition of data (Partition Utility) is defined as:
PU({C₁, ..., C_N}) = [∑_k CU(C_k)] / N
The aim of COBWEB is to cluster the given observations (unclassified examples) hierarchically in such a way that Partition Utility is maximized at each level. Please refer to the COBWEB entry for a more detailed description.
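
To illustrate how Partition Utility ranks alternative partitions of the same data, here is a small self-contained sketch. It only scores given partitions and does not reproduce COBWEB's incremental search. It assumes nominal attributes and relative-frequency estimates; the function names sum_squared_probs and partition_utility are hypothetical.

```python
from collections import Counter


def sum_squared_probs(rows):
    """Sum over attributes i and values j of P(A_i = V_ij)^2, estimated from rows."""
    counts = Counter((i, v) for row in rows for i, v in enumerate(row))
    return sum((c / len(rows)) ** 2 for c in counts.values())


def partition_utility(partition):
    """PU = (sum_k CU(C_k)) / N for a partition given as a list of non-empty clusters."""
    data = [row for cluster in partition for row in cluster]
    baseline = sum_squared_probs(data)
    total = sum(
        (len(cluster) / len(data)) * (sum_squared_probs(cluster) - baseline)
        for cluster in partition
    )
    return total / len(partition)


red_small, red_large = ("red", "small"), ("red", "large")
blue_small, blue_large = ("blue", "small"), ("blue", "large")

# Two candidate partitions of the same four observations.
by_color = [[red_small, red_large], [blue_small, blue_large]]
mixed = [[red_small, blue_large], [red_large, blue_small]]

print(partition_utility(by_color))  # 0.25
print(partition_utility(mixed))     # 0.0
```

The partition by color scores higher because knowing the cluster fixes the color, whereas in the mixed partition each cluster has the same value distribution as the whole data set, so nothing is gained.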

Algorithm COBWEB