java.lang.Object
  edu.udo.cs.yale.operator.Operator
    edu.udo.cs.yale.operator.io.ResultSetExampleSource
      edu.udo.cs.yale.operator.io.DatabaseExampleSource
public class DatabaseExampleSource
This operator reads an ExampleSet from an SQL database. The SQL query can be passed to Yale via a parameter or, in the case of long SQL statements, in a separate file. Please note that column names are often case sensitive. Databases may behave differently here.
The most convenient way of defining the necessary parameters is the configuration wizard. The most important parameters (database URL and user name) will be automatically determined by this wizard and it is also possible to define the special attributes like labels or ids.
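Please note that this operator supports two basic working modes:
1. reading the data from the database and creating an example table in main memory (memory mode)
2. keeping the data in the database and working directly on the database table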
The latter possibility is turned on by the parameter "work_on_database". Please note that this working mode is still regarded as experimental and errors might occur. To ensure proper data changes, the database working mode is only allowed on a single table, which must be defined with the parameter "table_name". If you encounter problems during data updates (e.g. messages that the result set is not updatable), you have to define a primary key for your table.
If you are not working directly on the database, the data is read with an arbitrary SQL query statement (SELECT ... FROM ... WHERE ...) defined by "query" or "query_file". The memory mode is the recommended way of using this operator. This matters especially for subsequent operators such as learning schemes, which often load (most of) the data into main memory during the learning process.
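The choice between the "query" and "query_file" parameters can be pictured with a small sketch. This is not Yale's implementation: the class `QuerySourceDemo`, the method `resolveQuery`, and the rule that an inline query takes precedence over the file are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only; none of these names belong to the Yale API. */
final class QuerySourceDemo {

    /**
     * Returns the SQL text to execute.
     * Assumption: a non-empty inline query wins over the query file.
     */
    static String resolveQuery(String query, Path queryFile) throws IOException {
        if (query != null && !query.trim().isEmpty()) {
            return query.trim();
        }
        if (queryFile != null) {
            // Long statements are kept in a separate file and read as a whole.
            byte[] bytes = Files.readAllBytes(queryFile);
            return new String(bytes, StandardCharsets.UTF_8).trim();
        }
        throw new IllegalArgumentException("Neither 'query' nor 'query_file' is set");
    }
}
```

A caller would pass the value of "query" (possibly null) and the path taken from "query_file" (possibly null), then hand the returned string to a JDBC statement.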
Because the ResultSetMetaData interface does not provide information about the possible values of nominal attributes, the internal indices that the nominal values are mapped to depend on the order in which the values appear in the table. This causes problems only when experiments are split up into a training experiment and an application or testing experiment. For learning schemes which are capable of handling nominal attributes, this is not a problem. If a learning scheme like an SVM is used with nominal data, Yale pretends that nominal attributes are numerical and uses the indices of the nominal values as their numerical values. An SVM may perform well if there are only two possible values. If a test set is read in another experiment, the nominal values may be assigned different indices, and hence the trained SVM is useless. This is not a problem for label attributes, since the classes can be specified using the classes parameter; hence, all learning schemes intended for use with nominal data are safe to use.
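The ordering problem described above can be reproduced in a few lines. The following is a minimal sketch, not Yale code: the class `NominalIndexDemo` and the method `indexByAppearance` are hypothetical, and the sketch only assumes that indices are handed out in order of first appearance, as the text states.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; none of these names belong to the Yale API. */
final class NominalIndexDemo {

    /** Maps each nominal value to an index in order of first appearance. */
    static Map<String, Integer> indexByAppearance(List<String> values) {
        Map<String, Integer> mapping = new LinkedHashMap<>();
        for (String value : values) {
            // The next free index equals the number of values seen so far.
            mapping.putIfAbsent(value, mapping.size());
        }
        return mapping;
    }

    public static void main(String[] args) {
        List<String> trainingColumn = Arrays.asList("red", "blue", "red");
        List<String> testColumn = Arrays.asList("blue", "red");

        // Indices derived from the data depend on row order.
        System.out.println(indexByAppearance(trainingColumn)); // {red=0, blue=1}
        System.out.println(indexByAppearance(testColumn));     // {blue=0, red=1}

        // Declaring the values up front gives one mapping for both experiments.
        Map<String, Integer> declared = indexByAppearance(Arrays.asList("blue", "red"));
        System.out.println(declared.get("red")); // 1, whichever table is read
    }
}
```

A model trained on the first mapping would read "red" as 0, while the test table encodes it as 1. Fixing the value list in advance, as the classes parameter does for labels, removes that dependence on row order.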
Field Summary | |
---|---|
private DatabaseHandler | dbAccess |
Constructor Summary | |
---|---|
DatabaseExampleSource(OperatorDescription description) | |
Method Summary | |
---|---|
IOObject[] | apply() Implement this method in subclasses. |
private void | disconnect() |
void | experimentFinished() Called at the end of the experiment. |
private DatabaseHandler | getConnectedDatabaseHandler() |
java.util.List<ParameterType> | getParameterTypes() Returns a list of ParameterTypes describing the parameters of this operator. |
private java.lang.String | getQuery() |
java.sql.ResultSet | getResultSet() This method reads the file whose name is given, extracts the database access information and the query from it and executes the query. |
void | setNominalValues(java.util.List attributeList, java.sql.ResultSet resultSet, Attribute label) Since the ResultSet does not provide information about possible values of nominal attributes, subclasses must set these by implementing this method. |
private void | setNominalValuesForLabel(Attribute label) |
Methods inherited from class edu.udo.cs.yale.operator.io.ResultSetExampleSource |
---|
createExampleSet, getInputClasses, getOutputClasses |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
private DatabaseHandler dbAccess
Constructor Detail |
---|
public DatabaseExampleSource(OperatorDescription description)
Method Detail |
---|
public void setNominalValues(java.util.List attributeList, java.sql.ResultSet resultSet, Attribute label) throws UndefinedParameterError
Description copied from class: ResultSetExampleSource
Since the ResultSet does not provide information about possible values of nominal attributes, subclasses must set these by implementing this method.
setNominalValues in class ResultSetExampleSource
Parameters: attributeList - List of Attribute
Throws: UndefinedParameterError
private void setNominalValuesForLabel(Attribute label) throws UndefinedParameterError
Throws: UndefinedParameterError
public IOObject[] apply() throws OperatorException
Description copied from class: Operator
Implement this method in subclasses.
apply in class ResultSetExampleSource
Throws: OperatorException
private java.lang.String getQuery() throws OperatorException
Throws: OperatorException
private DatabaseHandler getConnectedDatabaseHandler() throws OperatorException, java.sql.SQLException
Throws: OperatorException, java.sql.SQLException
public java.sql.ResultSet getResultSet() throws OperatorException
getResultSet in class ResultSetExampleSource
Throws: OperatorException
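public void experimentFinished()
Description copied from class: Operator
Called at the end of the experiment.
experimentFinished in class Operator
private void disconnect()
public java.util.List<ParameterType> getParameterTypes()
Description copied from class: Operator
Returns a list of ParameterTypes describing the parameters of this operator.
getParameterTypes in class ResultSetExampleSource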