edu.udo.cs.yale.operator.learner.meta
Class SDRulesetInduction

java.lang.Object
  extended by edu.udo.cs.yale.operator.Operator
      extended by edu.udo.cs.yale.operator.OperatorChain
          extended by edu.udo.cs.yale.operator.learner.meta.SDRulesetInduction
All Implemented Interfaces:
ConfigurationListener

public class SDRulesetInduction
extends OperatorChain

Subgroup discovery learner.

Version:
$Id: SDRulesetInduction.java,v 1.20 2006/04/05 08:57:26 ingomierswa Exp $
Author:
Martin Scholz

Field Summary
private  int currentIteration
           
static java.lang.String GAMMA
          Boolean parameter to specify whether the label priors should be equally likely after the first iteration.
static java.lang.String INTERNAL_BOOTSTRAP
          Name of the flag indicating internal bootstrapping.
static double MIN_ADVANTAGE
          Discard models with an advantage of less than the specified value.
static java.lang.String NUM_OF_ITERATIONS
          Name of the variable specifying the maximal number of iterations of the learner.
private  double performance
           
static java.lang.String REWEIGHT
          Boolean parameter: true for additive reweighting, false for multiplicative.
static java.lang.String ROC_FILTER
          Parameter specifying whether to discard all rules not lying on the convex hull in ROC space.
static java.lang.String TIMES_COVERED
          Name of special attribute counting the times an example has been covered by a rule.
 
Constructor Summary
SDRulesetInduction(OperatorDescription description)
          Constructor.
 
Method Summary
 IOObject[] apply()
          Constructs a Model by repeatedly running a weak learner, reweighting the training example set accordingly, and combining the hypotheses using the available weighted performance values.
private  void debugMessage(SDReweightMeasures wp)
           
 InnerOperatorCondition getInnerOperatorCondition()
          Must return a condition of the IO behaviour of all desired inner operators.
 java.lang.Class[] getInputClasses()
          Returns the classes that are needed as input.
 int getMaxNumberOfInnerOperators()
          Returns the maximum number of inner operators.
 int getMinNumberOfInnerOperators()
          Returns the minimum number of inner operators.
 int getNumberOfSteps()
          Returns the number of steps performed by this chain.
 java.lang.Class[] getOutputClasses()
          Returns the classes that are guaranteed to be returned by apply() as additional output.
 java.util.List<ParameterType> getParameterTypes()
          Adds the parameters "number of iterations" and "model file".
static int getPosIndex(Attribute label)
           
private  boolean isOnConvexHull(java.util.List<double[]> rocCurve, double tpr, double fpr)
           
private  double[] prepareWeights(ExampleSet exampleSet)
          Creates a weight attribute if not yet done and fills it with an initial value so that positive and negative examples are equally probable.
private  PredictionModel trainModel(ExampleSet exampleSet)
          Runs the "embedded" learner on the example set and returns a model.
private  SDEnsemble trainRuleset(ExampleSet trainingSet, double[] classPriors)
          Main method for training the ensemble classifier.
 
Methods inherited from class edu.udo.cs.yale.operator.OperatorChain
addAddListener, addOperator, addOperator, checkDeprecations, checkIO, checkNumberOfInnerOperators, checkProperties, clearErrorList, clearStepCounter, cloneOperator, countStep, createExperimentTree, delete, experimentFinished, experimentStarts, getAllInnerOperators, getCurrentStep, getIndexOfOperator, getInnerOperatorForName, getInnerOperatorsXML, getNumberOfAllOperators, getNumberOfChildrensSteps, getNumberOfOperators, getOperator, getOperatorFromAll, getOperators, isEnabled, performAdditionalChecks, removeAddListener, removeOperator, setEnabled, setExperiment, shouldReturnInnerOutput
 
Methods inherited from class edu.udo.cs.yale.operator.Operator
addError, addValue, addWarning, apply, createExperimentTree, createFromXML, createMarkedExperimentTree, getAddOnlyAdditionalOutput, getApplyCount, getDeliveredOutputClasses, getDeprecationInfo, getDesiredInputClasses, getErrorList, getExperiment, getInput, getInput, getInput, getInputDescription, getIOContainerForInApplyLoopBreakpoint, getName, getOperatorClassName, getOperatorDescription, getParameter, getParameterAsBoolean, getParameterAsColor, getParameterAsDouble, getParameterAsFile, getParameterAsInt, getParameterAsString, getParameterList, getParameters, getParameterType, getParent, getStartTime, getStatus, getUserDescription, getValue, getValues, getXML, hasBreakpoint, hasBreakpoint, hasInput, inApplyLoop, isParameterSet, logMessage, register, remove, rename, resume, setBreakpoint, setInput, setListParameter, setOperatorParameters, setParameter, setParameters, setParent, setUserDescription, toString, writeXML
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

NUM_OF_ITERATIONS

public static final java.lang.String NUM_OF_ITERATIONS
Name of the variable specifying the maximal number of iterations of the learner.

See Also:
Constant Field Values


INTERNAL_BOOTSTRAP

public static final java.lang.String INTERNAL_BOOTSTRAP
Name of the flag indicating internal bootstrapping.

See Also:
Constant Field Values


ROC_FILTER

public static final java.lang.String ROC_FILTER
Parameter specifying whether to discard all rules not lying on the convex hull in ROC space.

See Also:
Constant Field Values


REWEIGHT

public static final java.lang.String REWEIGHT
Boolean parameter: true for additive reweighting, false for multiplicative.

See Also:
Constant Field Values
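
The description above names the two reweighting modes without stating formulas. As a minimal sketch, the schemes commonly used in the subgroup discovery literature (for example in CN2-SD) give an example that has been covered i times the weight 1/(i+1) in the additive case and gamma^i in the multiplicative case. Whether this class uses exactly these formulas is an assumption; the class name ReweightSketch, both method names, and the numeric decay factor gamma are hypothetical and are not part of the YALE API (the GAMMA constant on this page is documented as a Boolean parameter).

```java
final class ReweightSketch {

    /** Additive scheme: weight 1/(i+1) for an example covered i times. */
    static double additiveWeight(int timesCovered) {
        return 1.0 / (timesCovered + 1);
    }

    /**
     * Multiplicative scheme: weight gamma^i for an example covered i times,
     * with a hypothetical decay factor 0 < gamma < 1.
     */
    static double multiplicativeWeight(int timesCovered, double gamma) {
        return Math.pow(gamma, timesCovered);
    }

    public static void main(String[] args) {
        // Compare how fast both schemes reduce the weight of covered examples.
        for (int i = 0; i <= 3; i++) {
            System.out.printf("covered %d times: additive=%.3f multiplicative=%.3f%n",
                    i, additiveWeight(i), multiplicativeWeight(i, 0.5));
        }
    }
}
```

In the additive scheme the weight depends only on a per-example coverage count, which is consistent with a counter attribute such as TIMES_COVERED being needed for that mode only.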


GAMMA

public static final java.lang.String GAMMA
Boolean parameter to specify whether the label priors should be equally likely after the first iteration.

See Also:
Constant Field Values


TIMES_COVERED

public static final java.lang.String TIMES_COVERED
Name of the special attribute counting the times an example has been covered by a rule. This attribute is created only for additive reweighting.

See Also:
Constant Field Values


MIN_ADVANTAGE

public static final double MIN_ADVANTAGE
Discard models with an advantage of less than the specified value.

See Also:
Constant Field Values


performance

private double performance

currentIteration

private int currentIteration

Constructor Detail

SDRulesetInduction

public SDRulesetInduction(OperatorDescription description)
Constructor.

Method Detail

getInnerOperatorCondition

public InnerOperatorCondition getInnerOperatorCondition()
Description copied from class: OperatorChain
Must return a condition of the IO behaviour of all desired inner operators. If there are no "special" conditions and the chain works similarly to a simple operator chain, this method should at least return a SimpleChainInnerOperatorCondition. More than one condition should be combined with the help of the class CombinedInnerOperatorCondition.

Specified by:
getInnerOperatorCondition in class OperatorChain


getMaxNumberOfInnerOperators

public int getMaxNumberOfInnerOperators()
Description copied from class: OperatorChain
Returns the maximum number of inner operators.

Specified by:
getMaxNumberOfInnerOperators in class OperatorChain
See Also:
OperatorChain.getMaxNumberOfInnerOperators()


getMinNumberOfInnerOperators

public int getMinNumberOfInnerOperators()
Description copied from class: OperatorChain
Returns the minimum number of inner operators.

Specified by:
getMinNumberOfInnerOperators in class OperatorChain
See Also:
OperatorChain.getMinNumberOfInnerOperators()


getNumberOfSteps

public int getNumberOfSteps()
Description copied from class: OperatorChain
Returns the number of steps performed by this chain.

Specified by:
getNumberOfSteps in class OperatorChain
See Also:
OperatorChain.getNumberOfSteps()


getInputClasses

public java.lang.Class[] getInputClasses()
Description copied from class: Operator
Returns the classes that are needed as input. May be null or an empty array (no desired input). By default, all delivered input objects are consumed and must also be delivered as output in both Operator.getOutputClasses() and Operator.apply() if this is necessary. This default behavior can be changed by overriding Operator.getInputDescription(Class). Subclasses which implement this method should not make use of parameters since this method is invoked by getParameterTypes(). Therefore, parameters are not fully available at this point and this might lead to exceptions. Please use InputDescriptions instead.

Specified by:
getInputClasses in class Operator
See Also:
Operator.getInputClasses()


getOutputClasses

public java.lang.Class[] getOutputClasses()
Description copied from class: Operator
Returns the classes that are guaranteed to be returned by apply() as additional output. Please note that input objects which should not be consumed must also be defined by this method (e.g. for preprocessing operators). The default behavior for input consumption is defined by Operator.getInputDescription(Class) and can be changed by overriding this method. Objects which are not consumed must not be defined as additional output in this method. May be null or an empty array (no additional output is produced).

Specified by:
getOutputClasses in class Operator
See Also:
Operator.getOutputClasses()


getPosIndex

public static int getPosIndex(Attribute label)

prepareWeights

private double[] prepareWeights(ExampleSet exampleSet)
                         throws OperatorException
Creates a weight attribute if not yet done and fills it with an initial value so that positive and negative examples are equally probable.

Parameters:
exampleSet - the example set to be prepared
Throws:
OperatorException
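
To illustrate what "equally probable" can mean for the initial weights, here is a minimal sketch that gives every positive example the weight n/(2*pos) and every negative example the weight n/(2*neg), so that both classes carry the same total weight. This is an assumption about one possible scheme, not the method's actual code: the class InitialWeightsSketch, the method balancedWeights, and the use of a plain boolean[] in place of an ExampleSet are hypothetical simplifications.

```java
final class InitialWeightsSketch {

    /**
     * Returns one weight per example such that the positive and the negative
     * class each hold half of the total weight n.
     */
    static double[] balancedWeights(boolean[] isPositive) {
        int n = isPositive.length;
        int pos = 0;
        for (boolean p : isPositive) {
            if (p) {
                pos++;
            }
        }
        int neg = n - pos;
        if (pos == 0 || neg == 0) {
            throw new IllegalArgumentException("both classes must be present");
        }
        double posWeight = n / (2.0 * pos); // all positives together sum to n/2
        double negWeight = n / (2.0 * neg); // all negatives together sum to n/2
        double[] weights = new double[n];
        for (int i = 0; i < n; i++) {
            weights[i] = isPositive[i] ? posWeight : negWeight;
        }
        return weights;
    }
}
```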


trainModel

private PredictionModel trainModel(ExampleSet exampleSet)
                            throws OperatorException
Runs the "embedded" learner on the example set and returns a model.

Parameters:
exampleSet - an ExampleSet to train a model for
Returns:
a Model
Throws:
OperatorException


apply

public IOObject[] apply()
                 throws OperatorException
Constructs a Model by repeatedly running a weak learner, reweighting the training example set accordingly, and combining the hypotheses using the available weighted performance values. If the input contains a model, then this model is used as a starting point for weighting the examples.

Overrides:
apply in class OperatorChain
Returns:
the last inner operator's output or the input itself if the chain is empty.
Throws:
OperatorException
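
The loop described above (train, evaluate on the weighted examples, discard weak models, reweight) can be sketched roughly as follows. Everything in this block is an assumption made for illustration: rules are stand-in predicates over example indices rather than models from an inner learner, "advantage" is taken to be weighted precision minus the weighted positive prior, reweighting is additive with weight 1/(i+1), and the names IterativeRuleLoopSketch and run are hypothetical. The real operator may define all of these differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

final class IterativeRuleLoopSketch {

    /** Returns the indices of the candidate rules that were kept. */
    static List<Integer> run(List<IntPredicate> candidateRules, boolean[] isPositive,
                             int maxIterations, double minAdvantage) {
        int n = isPositive.length;
        int[] timesCovered = new int[n];
        List<Integer> kept = new ArrayList<>();
        for (int it = 0; it < maxIterations && it < candidateRules.size(); it++) {
            IntPredicate rule = candidateRules.get(it); // stand-in for the weak learner
            double total = 0, totalPos = 0, covered = 0, coveredPos = 0;
            for (int i = 0; i < n; i++) {
                double w = 1.0 / (timesCovered[i] + 1); // additive weight (assumed)
                total += w;
                if (isPositive[i]) totalPos += w;
                if (rule.test(i)) {
                    covered += w;
                    if (isPositive[i]) coveredPos += w;
                }
            }
            if (covered == 0) continue; // rule covers nothing
            // Assumed definition: weighted precision minus weighted prior.
            double advantage = coveredPos / covered - totalPos / total;
            if (advantage < minAdvantage) continue; // discard weak rule
            kept.add(it);
            for (int i = 0; i < n; i++) {
                if (rule.test(i)) timesCovered[i]++; // reweight covered examples
            }
        }
        return kept;
    }
}
```

Because covered examples lose weight after each accepted rule, later rules are rewarded for describing parts of the data that earlier rules did not cover.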


trainRuleset

private SDEnsemble trainRuleset(ExampleSet trainingSet,
                                double[] classPriors)
                         throws OperatorException
Main method for training the ensemble classifier.

Throws:
OperatorException


debugMessage

private void debugMessage(SDReweightMeasures wp)

isOnConvexHull

private boolean isOnConvexHull(java.util.List<double[]> rocCurve,
                               double tpr,
                               double fpr)
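
This method has no description. As a minimal sketch of how such a test might work for the ROC_FILTER parameter, the following code checks whether a rule's (fpr, tpr) point lies on or above the piecewise-linear upper hull of the points collected so far. The layout of each double[] as {fpr, tpr}, the ascending ordering by fpr, the tolerance, and the names RocHullSketch and onOrAboveHull are assumptions and not taken from the actual implementation.

```java
import java.util.List;

final class RocHullSketch {

    private static final double EPS = 1e-12; // tolerance for rounding errors

    /**
     * hull: points {fpr, tpr} sorted by ascending fpr, assumed to run
     * from (0,0) to (1,1). Returns true if (fpr, tpr) is not below the hull.
     */
    static boolean onOrAboveHull(List<double[]> hull, double tpr, double fpr) {
        double hullTpr = 0.0;
        for (int i = 0; i + 1 < hull.size(); i++) {
            double[] a = hull.get(i);
            double[] b = hull.get(i + 1);
            if (fpr < a[0] || fpr > b[0]) {
                continue; // segment does not span this fpr
            }
            double t;
            if (b[0] == a[0]) {
                t = Math.max(a[1], b[1]); // vertical segment
            } else {
                // linear interpolation of the hull's tpr at the given fpr
                t = a[1] + (b[1] - a[1]) * (fpr - a[0]) / (b[0] - a[0]);
            }
            hullTpr = Math.max(hullTpr, t);
        }
        return tpr >= hullTpr - EPS;
    }
}
```

A rule whose point falls below the hull is dominated by a combination of existing rules, which is the usual reason for discarding it.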

getParameterTypes

public java.util.List<ParameterType> getParameterTypes()
Adds the parameters "number of iterations" and "model file".

Overrides:
getParameterTypes in class Operator



Copyright © 2001-2006