
Experiments

The goal of this experimental evaluation is to investigate empirically the speedups that can be obtained by re-implementing ILP systems so that they use the pack execution mechanism. At present such re-implementations exist for the TILDE and WARMR systems, so we have used these for our experiments. Both are part of the ACE data mining tool, which is available for academic use upon request.⁴ We attempt to quantify (a) the speedup of packs with respect to separate execution of queries (thus validating our complexity analysis), and (b) the total speedup that this can yield for an ILP system.
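As a minimal illustration of the two quantities (a) and (b), the sketch below computes a speedup as the ratio of a baseline running time to the running time with packs. The function name `speedup` and all timing values are hypothetical and chosen only for illustration; they are not measurements from these experiments.

```python
def speedup(baseline_seconds: float, pack_seconds: float) -> float:
    """Return baseline time divided by pack time; a value above 1 means packs are faster."""
    if baseline_seconds < 0 or pack_seconds <= 0:
        raise ValueError("timings must be non-negative, and the pack timing must be positive")
    return baseline_seconds / pack_seconds


# Hypothetical timings (seconds), not taken from the paper.
# (a) time spent executing queries only: separately vs. as a pack.
query_speedup = speedup(baseline_seconds=12.0, pack_seconds=4.0)
# (b) total running time of the ILP system: without vs. with packs.
total_speedup = speedup(baseline_seconds=30.0, pack_seconds=20.0)

print(f"query execution speedup: {query_speedup:.2f}")
print(f"total system speedup:    {total_speedup:.2f}")
```

The total speedup in (b) is in general smaller than the query execution speedup in (a), since an ILP system also spends time on work other than query execution.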

The data sets that we have used for our experiments are the following:

The Mutagenesis data set: an ILP benchmark data set, introduced to the ILP community by [32], which consists of structural descriptions of 230 molecules that are to be classified as mutagenic or not. Besides the standard Mutagenesis data set, we also consider versions of it in which each example occurs n times; this allows us to easily generate larger data sets in which the average example complexity and query complexity remain constant and equal to those of the original data set.
Bongard data sets: introduced in ILP by [17], the so-called "Bongard problems" are a simplified version of problems used by [9] for research on pattern recognition. A number of drawings are shown, each containing a number of elementary geometrical figures; the drawings have to be classified according to relations that hold among the figures in them. We use a Bongard problem generator to create data sets of varying size.
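The enlarged Mutagenesis variants described above can be pictured with the following minimal sketch, which repeats every example a fixed number of times. The representation of an example as a list of facts, the function name `replicate_examples`, and the replication factor `n` are assumptions made for illustration; this is not the procedure used by ACE.

```python
from typing import List, Sequence


def replicate_examples(examples: Sequence[List[str]], n: int) -> List[List[str]]:
    """Return a data set in which every example occurs n times."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [example for example in examples for _ in range(n)]


def average_size(examples: Sequence[List[str]]) -> float:
    """Average number of facts per example, a rough stand-in for example complexity."""
    return sum(len(example) for example in examples) / len(examples)


# Toy examples (hypothetical facts), each standing in for one molecule description.
original = [
    ["atom(a1,c)", "atom(a2,o)", "bond(a1,a2,2)"],
    ["atom(b1,c)"],
]
enlarged = replicate_examples(original, n=5)

print(len(original), len(enlarged))
print(average_size(original), average_size(enlarged))
```

Because every example is repeated equally often, the data set grows by a factor of n while the average example size stays the same, which is the property the replicated data sets rely on.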

The experiments were run on SUN workstations: a Sparc Ultra-60 at 360 MHz for TILDE and a Sparc Ultra-10 at 333 MHz for WARMR. TILDE and WARMR were run with their default settings, except where stated otherwise.



Hendrik Blockeel 2002-02-26