The goal of this experimental evaluation is to empirically investigate
the actual speedups that can be obtained by re-implementing ILP
systems so that they use the pack execution mechanism. At this moment
such re-implementations exist for the TILDE and WARMR systems,
hence we have used these for our experiments. These re-implementations
are available within the ACE data mining tool, available for academic
use upon request.
We attempt to quantify (a) the speedup of packs w.r.t. separate
execution of queries (thus validating our complexity analysis), and
(b) the total speedup that this can yield for an ILP system.
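As a minimal sketch of how the two quantities (a) and (b) could be computed from timing measurements, the snippet below takes speedup to be the ratio of the time without packs to the time with packs. The function name `speedup` and the timings in the usage lines are illustrative assumptions, not measurements from these experiments.

```python
def speedup(separate_seconds: float, pack_seconds: float) -> float:
    """Return how many times faster the pack-based run is.

    Assumption: speedup is defined as (time without packs) / (time with packs).
    """
    if separate_seconds < 0 or pack_seconds <= 0:
        raise ValueError("timings must be non-negative, and pack time positive")
    return separate_seconds / pack_seconds


# Placeholder timings (hypothetical, for illustration only).
# (a) query execution only: separate queries vs. one pack
query_speedup = speedup(separate_seconds=10.0, pack_seconds=2.5)
# (b) the whole ILP run: original system vs. pack-based re-implementation
total_speedup = speedup(separate_seconds=30.0, pack_seconds=20.0)

print(f"query-execution speedup: {query_speedup:.2f}")
print(f"total speedup: {total_speedup:.2f}")
```

Reporting the two ratios separately shows how much of the gain in query execution carries over to the system as a whole.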
The data sets that we have used for our experiments are the following:
- The Mutagenesis data set: an ILP benchmark data set, introduced
to the ILP community by [32], that consists of
structural descriptions of 230 molecules that are to be classified
as mutagenic or not. Next to the standard Mutagenesis data set,
we also consider versions of it where each example occurs n times;
this allows us to easily generate data sets of larger size where the
average example and query complexity are constant and equal to those of
the original data set.
- Bongard data sets: introduced in ILP by
[17], the so-called "Bongard problems" are a
simplified version of problems used by [9]
for research on pattern recognition. A number of drawings are shown,
each containing a number of elementary geometrical figures; the drawings
have to be classified according to relations that hold on the figures in
them. We use a Bongard problem generator to create data sets of varying
size.
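The replication scheme described for the Mutagenesis data set can be sketched as follows. This is an illustration under assumptions, not the tooling used for the experiments: each example is represented here as a plain list of fact strings, and the names `replicate_examples` and `average_example_size` are hypothetical.

```python
from statistics import mean


def replicate_examples(examples, n):
    """Return a data set in which every example occurs n times.

    Assumption: an example is a list of facts (strings here).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Copy each example so the replicas do not share mutable state.
    return [list(facts) for _ in range(n) for facts in examples]


def average_example_size(examples):
    """Average number of facts per example, a crude proxy for complexity."""
    return mean(len(facts) for facts in examples)


# Toy stand-in for a molecular data set (made-up facts).
toy = [
    ["atom(a1,c)", "atom(a2,o)", "bond(a1,a2,2)"],
    ["atom(b1,n)"],
]
bigger = replicate_examples(toy, 3)

print(len(toy), len(bigger))
print(average_example_size(toy) == average_example_size(bigger))
```

Because every example is copied the same number of times, the number of examples grows by a factor n while the per-example statistics stay the same, which is the property the scaled data sets rely on.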
The experiments were run on SUN workstations: a Sparc Ultra-60
at 360 MHz for TILDE, a Sparc Ultra-10 at 333 MHz for WARMR.
TILDE and WARMR were run with their default settings, except where
stated otherwise.
Hendrik Blockeel
2002-02-26