The attribute editor
Example sets or instance sets in YALE are described using a separate XML
document. This attribute description file contains information about
the type of data and its source. A data set can be distributed over
several files, which is particularly useful if the label is stored
in a file of its own. If you want to edit this file by hand, see the
YALE Tutorial.
The GUI displays a small Edit button next to an attribute
description file property (e.g. the parameter attributes of an
ExampleSource) in the property editor. Clicking it opens a dialog called
Attribute Editor, which contains a table with one column for
each attribute (figure 8). If the property
does not yet reference a proper attribute description file, the dialog
will be empty. If you want to follow the instructions below, which describe
how to create the XML description file, you can start from scratch by
clicking clear in the table menu of the menu bar.
Figure 8:
The Attribute Editor dialog for data loading and attribute
description file creation.
Assume you have a data file containing 50 rows of whitespace- or
comma-separated attribute values, five per row. Click on load data to open
that file. You should then see five columns, each with several header rows,
followed by the data in the table cells. Question marks ("?") indicate missing
values. The following list explains the meaning of the header rows:
- The first header row shows the source file and column index. It
is not editable and is displayed for your information only.
- The second row indicates what the data is used for. It can
be an ordinary attribute, a label for classification or
regression tasks, or a weight that can be used with certain
algorithms. There can be at most one label and one weight attribute.
- The third row is the SI unit, given as a sequence of base
units and exponents. For example, the unit Newton would be specified as
kg1m1s-2, because $1\,\mathrm{N} = 1\,\mathrm{kg\,m\,s^{-2}}$. An
exponent of 1 can be omitted. This feature is useful if you want
additional features to be generated automatically, e.g. by a
FeatureGenerationOperator. Units are taken into account
to rule out meaningless attribute combinations, e.g. adding a time and a
distance attribute.
- The fourth row is the value type. The most relevant choices are
real / integer and nominal. YALE
should have detected these automatically.
- The last header row is the block type. The most relevant choices are
single_value (default) and value_series. Some
experiments treat value series in a special way. Do not forget
to assign value_series_start and value_series_end to
the first and last column of the series, respectively.
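To illustrate the unit notation from the list above, the following Python sketch parses a unit string such as kg1m1s-2 into base units and exponents and checks whether two attributes could sensibly be added. It is not YALE code: the function names parse_unit and can_add, the set of accepted base-unit symbols, and the rule that addition requires identical units are assumptions made for this illustration.

```python
import re

# Assumed set of base-unit symbols (the SI base units); YALE may accept others.
# Longer symbols come first so that "mol" is not read as "m" followed by "ol".
_TOKEN = re.compile(r"(mol|kg|cd|m|s|A|K)(-?\d+)?")


def parse_unit(unit):
    """Parse a string like 'kg1m1s-2' into {'kg': 1, 'm': 1, 's': -2}."""
    exponents = {}
    pos = 0
    while pos < len(unit):
        match = _TOKEN.match(unit, pos)
        if match is None:
            raise ValueError(f"cannot parse unit {unit!r} at position {pos}")
        base, exponent = match.groups()
        # An omitted exponent means 1, as stated in the text.
        exponents[base] = exponents.get(base, 0) + int(exponent or 1)
        pos = match.end()
    # Drop base units whose exponents cancelled out.
    return {base: e for base, e in exponents.items() if e != 0}


def can_add(unit_a, unit_b):
    """Assumed rule: two attributes may be added only if their units match."""
    return parse_unit(unit_a) == parse_unit(unit_b)


if __name__ == "__main__":
    print(parse_unit("kg1m1s-2"))         # {'kg': 1, 'm': 1, 's': -2}
    print(can_add("kgms-2", "kg1m1s-2"))  # True
    print(can_add("s", "m"))              # False
```

Under these assumptions, a time attribute (s) and a distance attribute (m) are rejected for addition, while two force attributes written with and without explicit exponents of 1 are accepted.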
You can change the values according to your needs and load an
arbitrary number of data files. Finally, click on Save
attribute description file in the file menu
to write the XML file to disk.
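The loading step described earlier (whitespace- or comma-separated values, "?" for missing values, automatic detection of the value type) can be approximated outside of YALE with a short Python sketch. The functions read_rows and guess_value_type are hypothetical helpers, and the detection rule shown here is an assumption, not YALE's actual logic.

```python
import re

MISSING = "?"  # missing-value marker, as described in the text


def read_rows(text):
    """Split each non-empty line into fields on commas or whitespace."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(r"\s*,\s*|\s+", line))
    return rows


def _all_parse(values, convert):
    """Return True if convert() accepts every value."""
    try:
        for value in values:
            convert(value)
    except ValueError:
        return False
    return True


def guess_value_type(column):
    """Guess 'integer', 'real' or 'nominal' from the non-missing values."""
    known = [value for value in column if value != MISSING]
    if not known:
        return "nominal"  # assumption: no evidence, fall back to nominal
    if _all_parse(known, int):
        return "integer"
    if _all_parse(known, float):
        return "real"
    return "nominal"


if __name__ == "__main__":
    # Two rows with five values each; the fourth value of row one is missing.
    sample = "5.1,3.5,1,?,red\n4.9 3.0 2 7 blue\n"
    rows = read_rows(sample)
    columns = list(zip(*rows))
    print([guess_value_type(column) for column in columns])
    # ['real', 'real', 'integer', 'integer', 'nominal']
```

Missing values are ignored when guessing the type, so a column that contains "?" alongside integers is still reported as integer.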
yale-team@lists.sourceforge.net