next up previous contents
Next: Tip: Up: Setting up an experiment Previous: Operator info   Contents


The attribute editor

Example sets (also called instance sets) in YALE are described by a separate XML document, the attribute description file. This file contains information about the type of the data and its source. A data set can be distributed over several files, which is particularly useful if the label is stored in a file of its own. The YALE Tutorial explains the file format in case you want to edit the file yourself.

The GUI displays a small Edit button next to each attribute description file property in the property editor (e.g. the parameter attributes of an ExampleSource). Clicking it opens a dialog called Attribute Editor, which contains a table with one column for each attribute (figure 8). If the property does not yet reference a valid attribute description file, the dialog will be empty. To follow the instructions below and create the XML description file from scratch, click on clear in the table menu of the menu bar.

Figure 8: The Attribute Editor dialog for data loading and attribute description file creation.

Assume you have a data file containing 50 rows of whitespace- or comma-separated attribute values, five per row. Click on load data to open that file. You should then see five columns, each with several header rows followed by the data in the table cells. Question marks ("?") indicate missing values. The following enumeration explains the meanings of the table headers:

  1. The first header row shows the source file and column index. It is not editable and is for your information only.
  2. The second row indicates what the data is used for: an ordinary attribute, a label for classification or regression tasks, or a weight that can be used with certain algorithms. There can be at most one label and one weight attribute.
  3. The third row is the SI unit, given as a sequence of base units and exponents. For example, the unit Newton is specified as kg1m1s-2, because $1\,\mathrm{N} = \frac{\mathrm{kg}\cdot\mathrm{m}}{\mathrm{s}^2}$. An exponent of 1 can be omitted. This feature is useful if you want additional features to be generated automatically, e.g. by a FeatureGenerationOperator. Units are used to rule out meaningless attribute combinations, e.g. adding a time attribute to a distance attribute.
  4. The fourth row is the value type. The most interesting choices are real / integer and nominal. YALE should have detected these correctly on its own.
  5. The last header row is the block type. The most interesting choices are single_value (default) and value_series. Some experiments treat value series in a special way. Do not forget to assign value_series_start and value_series_end to the first and last column of the series, respectively.
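The unit notation from item 3 can be made concrete with a short sketch. It is not YALE's own code: the function names `parse_unit` and `can_add` are hypothetical, and the set of accepted base unit symbols (the seven SI base units) is an assumption, since this section does not list which symbols YALE recognizes. The sketch parses a unit string into a map from base unit to exponent and uses it to decide whether two attributes could sensibly be added.

```python
import re

# Assumed set of base unit symbols (the seven SI base units).
# "mol" is listed before "m" so that the longer symbol is tried first.
_TERM = re.compile(r"(mol|kg|cd|m|s|A|K)(-?\d+)?")


def parse_unit(spec):
    """Parse a string such as 'kg1m1s-2' into {'kg': 1, 'm': 1, 's': -2}."""
    exponents = {}
    pos = 0
    while pos < len(spec):
        match = _TERM.match(spec, pos)
        if match is None:
            raise ValueError(f"cannot parse unit at position {pos}: {spec!r}")
        name, exp = match.group(1), match.group(2)
        # An omitted exponent counts as 1.
        exponents[name] = exponents.get(name, 0) + (int(exp) if exp else 1)
        pos = match.end()
    # Drop base units whose exponents cancelled out.
    return {unit: e for unit, e in exponents.items() if e != 0}


def can_add(unit_a, unit_b):
    """Two quantities can be added only if their units are identical."""
    return parse_unit(unit_a) == parse_unit(unit_b)


newton = parse_unit("kg1m1s-2")
same_newton = parse_unit("kgms-2")      # exponents of 1 omitted
time_plus_distance = can_add("s", "m")  # False: a meaningless combination
```

Under these assumptions both spellings of Newton yield the same exponent map, and adding a time to a distance is rejected, which is the kind of restriction the text describes for automatic feature generation.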
You can change the values according to your needs and load an arbitrary number of data files. Finally, click on Save attribute description file in the file menu to write the XML file to disk.
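To illustrate the kind of data file the editor expects, the following sketch reads whitespace- or comma-separated rows, treats "?" as a missing value, and guesses a value type per column. This section does not describe how YALE detects value types, so the heuristic here is an assumption, and the names `parse_rows` and `guess_value_type` are hypothetical.

```python
import re

MISSING = "?"  # marker for missing values, as shown in the table cells


def parse_rows(text):
    """Split each non-empty line on commas and/or whitespace."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(r"[,\s]+", line))
    return rows


def _parses_as(converter, value):
    """Return True if converter(value) succeeds."""
    try:
        converter(value)
        return True
    except ValueError:
        return False


def guess_value_type(values):
    """Guess 'integer', 'real' or 'nominal', ignoring missing values."""
    present = [v for v in values if v != MISSING]
    if not present:
        return "nominal"  # nothing to go on; an arbitrary fallback
    if all(_parses_as(int, v) for v in present):
        return "integer"
    if all(_parses_as(float, v) for v in present):
        return "real"
    return "nominal"


sample = "5.1,3.5,1,?,setosa\n4.9 3.0 2 7 versicolor\n"
rows = parse_rows(sample)
column_types = [guess_value_type(column) for column in zip(*rows)]
```

For the two sample rows, the first two columns are guessed as real, the next two as integer (the "?" is skipped), and the last as nominal. Whatever the detection method, it is worth checking the fourth header row in the editor and correcting the type where the guess does not match your intent.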



yale-team@lists.sourceforge.net