edu.udo.cs.yale.operator.learner.meta
Class BayBoostStream

java.lang.Object
  extended by edu.udo.cs.yale.operator.Operator
      extended by edu.udo.cs.yale.operator.OperatorChain
          extended by edu.udo.cs.yale.operator.learner.meta.AbstractMetaLearner
              extended by edu.udo.cs.yale.operator.learner.meta.BayBoostStream
All Implemented Interfaces:
ConfigurationListener, Learner

public class BayBoostStream
extends AbstractMetaLearner

Assumptions:

  1. target label is always boolean
  2. goal is to fit a crisp ensemble classifier (use_distribution always off)
  3. base classifier weights are always adapted by a single row from first to last
  4. no internal bootstrapping

Version:
    $Id: BayBoostStream.java,v 1.37 2006/09/30 00:05:30 ingomierswa Exp $
Author:
    Martin Scholz

    Nested Class Summary
     class BayBoostStream.BatchFilterCondition
              Class that filters an ExampleSet by the value of a special attribute.
     
    Field Summary
    static java.lang.String BATCH_SIZE
              Name of the variable specifying the maximal number of iterations of the learner.
    private  int currentIteration
               
    static java.lang.String EQUALLY_PROB_LABELS
              Boolean parameter to specify whether the label priors should be equally likely after first iteration.
    static java.lang.String HOLD_OUT_RATIO
              Parameter name to activate a hold out set for tuning.
    static double MIN_ADVANTAGE
              Discard models with an advantage of less than the specified value.
    static double MIN_LIFT_RATIO_SOFT_CLASSIFIER
              The probabilistic prediction of soft classifiers is restricted, similar to a confidence bound.
    private  double[] oldWeights
               
    private  double performance
               
    private  RunVector runVector
               
    static java.lang.String STREAM_CONTROL_ATTRIB_NAME
              Name of the special attribute with additional stream control information.
     
    Constructor Summary
    BayBoostStream(OperatorDescription description)
              Constructor.
     
    Method Summary
    private  boolean adjustBaseModelWeights(ExampleSet exampleSet, java.util.Vector<BayBoostBaseModelInfo> modelInfo)
              This helper method takes as input the training set and the set of models trained so far.
     IOObject[] apply()
              Overridden to also return the performance (run) vector.
    private static void createOrReplacePredictedLabelFor(ExampleSet exampleSet, Model model)
              Helper method replacing Model.createPredictedLabel(ExampleSet) in order to lower memory consumption.
    private static void debugMessage(WeightedPerformanceMeasures wp)
               
    private  double evaluatePredictions(ExampleSet exampleSet)
              Returns the accuracy of the predictions for the given example set.
     int getNumberOfSteps()
              Returns the number of steps performed by this chain.
     java.lang.Class[] getOutputClasses()
              Overridden to also return the performance (run) vector.
     java.util.List<ParameterType> getParameterTypes()
              Adds the parameters "rescale label priors" and "weighted batch size".
    private  boolean isModelUseful(ContingencyMatrix cm)
              Helper method to decide whether a model improves the training error enough to be considered.
     Model learn(ExampleSet exampleSet)
              Constructs a Model by repeatedly running a weak learner, reweighting the training example set accordingly, and combining the hypotheses using the available weighted performance values.
    private  double[] prepareBatch(int currentBatchNum, java.util.Iterator<Example> reader, Attribute batchAttribute)
              The preparation part collecting the examples of a batch, computing priors and resetting weights to 1.
    private  double[] prepareExtendedBatch(ExampleSet extendedBatch)
              Similar to prepareBatch, but for extended batches.
    protected  void prepareWeights(ExampleSet exampleSet)
               
    private  void rescalePriors(ExampleSet exampleSet, double[] classPriors)
              Computes the weighted class priors of the boolean target attribute and shifts weights so that the priors are equal afterwards.
    private  void restoreOldWeights(ExampleSet exampleSet)
               
    private  BayBoostModel retrainLastWeight(BayBoostModel ensemble, ExampleSet exampleSet, java.util.Vector holdOutSet)
               
     boolean supportsCapability(LearnerCapability lc)
              Overrides the method of the super class.
    private  boolean trainAdditionalModel(ExampleSet trainingSet, java.util.Vector<BayBoostBaseModelInfo> modelInfo)
               
    private  Model trainBaseModel(ExampleSet exampleSet)
              Runs the "embedded" learner on the example set and returns a model.
     
    Methods inherited from class edu.udo.cs.yale.operator.learner.meta.AbstractMetaLearner
    applyInnerLearner, checkLearnerCapabilities, getEstimatedPerformance, getInnerOperatorCondition, getInputClasses, getInputDescription, getMaxNumberOfInnerOperators, getMinNumberOfInnerOperators, getWeights, shouldCalculateWeights, shouldEstimatePerformance, shouldReturnInnerOutput
     
    Methods inherited from class edu.udo.cs.yale.operator.OperatorChain
    addAddListener, addOperator, addOperator, checkDeprecations, checkIO, checkNumberOfInnerOperators, checkProperties, clearErrorList, clearStepCounter, cloneOperator, countStep, createExperimentTree, delete, experimentFinished, experimentStarts, getAllInnerOperators, getCurrentStep, getIndexOfOperator, getInnerOperatorForName, getInnerOperatorsXML, getNumberOfAllOperators, getNumberOfChildrensSteps, getNumberOfOperators, getOperator, getOperatorFromAll, getOperators, isEnabled, performAdditionalChecks, removeAddListener, removeOperator, setEnabled, setExperiment
     
    Methods inherited from class edu.udo.cs.yale.operator.Operator
    addError, addValue, addWarning, apply, createExperimentTree, createFromXML, createMarkedExperimentTree, getAddOnlyAdditionalOutput, getApplyCount, getDeliveredOutputClasses, getDeprecationInfo, getDesiredInputClasses, getErrorList, getExperiment, getInput, getInput, getInput, getIOContainerForInApplyLoopBreakpoint, getName, getOperatorClassName, getOperatorDescription, getParameter, getParameterAsBoolean, getParameterAsColor, getParameterAsDouble, getParameterAsFile, getParameterAsInt, getParameterAsString, getParameterList, getParameters, getParameterType, getParent, getStartTime, getStatus, getUserDescription, getValue, getValues, getXML, hasBreakpoint, hasBreakpoint, hasInput, inApplyLoop, isParameterSet, logMessage, register, remove, rename, resume, setBreakpoint, setInput, setListParameter, setOperatorParameters, setParameter, setParameters, setParent, setUserDescription, toString, writeXML
     
    Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
     
    Methods inherited from interface edu.udo.cs.yale.operator.learner.Learner
    getName
     

    Field Detail

    BATCH_SIZE

    public static final java.lang.String BATCH_SIZE
    Name of the variable specifying the maximal number of iterations of the learner.

    See Also:
    Constant Field Values


    EQUALLY_PROB_LABELS

    public static final java.lang.String EQUALLY_PROB_LABELS
    Boolean parameter to specify whether the label priors should be equally likely after first iteration.

    See Also:
    Constant Field Values


    HOLD_OUT_RATIO

    public static final java.lang.String HOLD_OUT_RATIO
    Parameter name to activate a hold out set for tuning.

    See Also:
    Constant Field Values


    MIN_ADVANTAGE

    public static final double MIN_ADVANTAGE
    Discard models with an advantage of less than the specified value.

    See Also:
    Constant Field Values


    STREAM_CONTROL_ATTRIB_NAME

    public static final java.lang.String STREAM_CONTROL_ATTRIB_NAME
    Name of the special attribute with additional stream control information.

    See Also:
    Constant Field Values


    MIN_LIFT_RATIO_SOFT_CLASSIFIER

    public static final double MIN_LIFT_RATIO_SOFT_CLASSIFIER
    The probabilistic prediction of soft classifiers is restricted, similar to a confidence bound. If the lift is close to 0 it is replaced by the minimum lift below. Analogously a maximum lift value is defined by (1 / MIN_LIFT_RATIO_SOFT_CLASSIFIER).

    See Also:
    Constant Field Values
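
The restriction described for MIN_LIFT_RATIO_SOFT_CLASSIFIER can be read as clamping a lift estimate to the interval [min, 1 / min]. The following is a minimal sketch of that reading, not the operator's actual code. The class name, the method name, and the value 0.2 are assumptions made for illustration; this page does not state the real value of the constant.

```java
final class LiftClampSketch {

    // Hypothetical value for illustration only; the real constant's value is not shown on this page.
    static final double MIN_LIFT_RATIO = 0.2;

    /** Restricts a lift estimate to the interval [minRatio, 1 / minRatio]. */
    static double clampLift(double lift, double minRatio) {
        if (minRatio <= 0.0 || minRatio > 1.0) {
            throw new IllegalArgumentException("minRatio must lie in (0, 1]");
        }
        double maxLift = 1.0 / minRatio;
        return Math.max(minRatio, Math.min(maxLift, lift));
    }

    public static void main(String[] args) {
        // A lift near 0 is raised to the minimum, a very large lift is cut at 1 / minimum.
        System.out.println(clampLift(0.0, MIN_LIFT_RATIO));
        System.out.println(clampLift(100.0, MIN_LIFT_RATIO));
        System.out.println(clampLift(1.3, MIN_LIFT_RATIO));
    }
}
```

Bounding the lift on both sides keeps a single soft classifier from dominating the ensemble with an extreme probability estimate.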


    runVector

    private RunVector runVector

    currentIteration

    private int currentIteration

    performance

    private double performance

    oldWeights

    private double[] oldWeights

    Constructor Detail

    BayBoostStream

    public BayBoostStream(OperatorDescription description)
    Constructor.

    Method Detail

    supportsCapability

    public boolean supportsCapability(LearnerCapability lc)
    Overrides the method of the super class. Returns true for polynominal class.

    Specified by:
    supportsCapability in interface Learner
    Overrides:
    supportsCapability in class AbstractMetaLearner


    getNumberOfSteps

    public int getNumberOfSteps()
    Description copied from class: OperatorChain
    Returns the number of steps performed by this chain.

    Overrides:
    getNumberOfSteps in class AbstractMetaLearner
    See Also:
    OperatorChain.getNumberOfSteps()


    prepareWeights

    protected void prepareWeights(ExampleSet exampleSet)

    restoreOldWeights

    private void restoreOldWeights(ExampleSet exampleSet)

    learn

    public Model learn(ExampleSet exampleSet)
                throws OperatorException
    Constructs a Model by repeatedly running a weak learner, reweighting the training example set accordingly, and combining the hypotheses using the available weighted performance values.

    Throws:
    OperatorException
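
To illustrate what a single reweighting step in such a loop might look like, the sketch below divides each example weight by the lift of its (prediction, label) pair. That lift-based rule is an assumption about how this operator reweights; the page itself only says that the example set is reweighted "accordingly". All names are hypothetical, and plain arrays stand in for ExampleSet and Model.

```java
final class LiftReweightSketch {

    /** Weighted joint distribution joint[prediction][label]; index 0 = negative, 1 = positive. */
    static double[][] joint(boolean[] preds, boolean[] labels, double[] weights) {
        double[][] j = new double[2][2];
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            j[preds[i] ? 1 : 0][labels[i] ? 1 : 0] += weights[i];
            total += weights[i];
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        for (int a = 0; a < 2; a++) {
            for (int b = 0; b < 2; b++) {
                j[a][b] /= total;
            }
        }
        return j;
    }

    /** Lift of "prediction implies label": P(pred, label) / (P(pred) * P(label)). */
    static double lift(double[][] j, int pred, int label) {
        double pPred = j[pred][0] + j[pred][1];
        double pLabel = j[0][label] + j[1][label];
        return j[pred][label] / (pPred * pLabel);
    }

    /** Divides each weight by the lift of its (prediction, label) cell, in place. */
    static void reweight(boolean[] preds, boolean[] labels, double[] weights) {
        double[][] j = joint(preds, labels, weights);
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] == 0.0) {
                continue; // zero-weight examples stay at zero and avoid a 0 / 0 division
            }
            weights[i] /= lift(j, preds[i] ? 1 : 0, labels[i] ? 1 : 0);
        }
    }
}
```

Under this rule, predictions and labels are statistically independent with respect to the new weights, so the next base model has to find structure the current one has not already explained.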


    retrainLastWeight

    private BayBoostModel retrainLastWeight(BayBoostModel ensemble,
                                            ExampleSet exampleSet,
                                            java.util.Vector holdOutSet)
                                     throws OperatorException
    Throws:
    OperatorException

    apply

    public IOObject[] apply()
                     throws OperatorException
    Overridden to also return the performance (run) vector.

    Overrides:
    apply in class AbstractMetaLearner
    Returns:
    the last inner operator's output or the input itself if the chain is empty.
    Throws:
    OperatorException


    getOutputClasses

    public java.lang.Class[] getOutputClasses()
    Overridden to also return the performance (run) vector.

    Overrides:
    getOutputClasses in class AbstractMetaLearner


    rescalePriors

    private void rescalePriors(ExampleSet exampleSet,
                               double[] classPriors)
    Computes the weighted class priors of the boolean target attribute and shifts weights so that the priors are equal afterwards.
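
The rescaling described here can be sketched as follows: compute the weighted prior of each class, then multiply every weight by 0.5 divided by the prior of its class. This is a minimal illustration under stated assumptions, not the method's actual implementation; the class and method names are hypothetical, and plain arrays stand in for the ExampleSet.

```java
final class RescalePriorsSketch {

    /** Returns {priorNegative, priorPositive} computed from the example weights. */
    static double[] weightedPriors(boolean[] labels, double[] weights) {
        double positive = 0.0;
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i];
            if (labels[i]) {
                positive += weights[i];
            }
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        return new double[] {(total - positive) / total, positive / total};
    }

    /** Scales the weights in place so that both classes carry half of the total weight. */
    static void rescaleToEqualPriors(boolean[] labels, double[] weights) {
        double[] priors = weightedPriors(labels, weights);
        if (priors[0] == 0.0 || priors[1] == 0.0) {
            return; // only one class present: equal priors cannot be reached by rescaling
        }
        double factorNegative = 0.5 / priors[0];
        double factorPositive = 0.5 / priors[1];
        for (int i = 0; i < weights.length; i++) {
            weights[i] *= labels[i] ? factorPositive : factorNegative;
        }
    }
}
```

With this choice of factors the total weight of the example set is unchanged; only its distribution over the two classes shifts.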


    trainBaseModel

    private Model trainBaseModel(ExampleSet exampleSet)
                          throws OperatorException
    Runs the "embedded" learner on the example set and returns a model.

    Parameters:
    exampleSet - an ExampleSet to train a model for
    Returns:
    a Model
    Throws:
    OperatorException


    prepareBatch

    private double[] prepareBatch(int currentBatchNum,
                                  java.util.Iterator<Example> reader,
                                  Attribute batchAttribute)
                           throws UndefinedParameterError
    The preparation part collecting the examples of a batch, computing priors and resetting weights to 1.

    Parameters:
    currentBatchNum - the batch number to be assigned to the examples
    reader - the Iterator with the cursor on the current point in the stream.
    batchAttribute - the attribute to write the batch number to
    Returns:
    the class priors of the batch
    Throws:
    UndefinedParameterError
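
The batch preparation described above can be sketched as a loop over the stream iterator. The example below is a simplified illustration, not the operator's code: StreamExample is a hypothetical stand-in for the YALE Example class, the batch number is stored in a plain field instead of a batch Attribute, and the batch size is passed in directly (the real method presumably reads it from an operator parameter, given the declared UndefinedParameterError).

```java
import java.util.Iterator;

final class PrepareBatchSketch {

    /** Hypothetical stand-in for a stream example; not the YALE Example class. */
    static final class StreamExample {
        final boolean label;
        double weight;
        int batch = -1;

        StreamExample(boolean label, double weight) {
            this.label = label;
            this.weight = weight;
        }
    }

    /**
     * Reads up to batchSize examples from the reader, tags them with the batch
     * number, resets their weights to 1 and returns {priorNegative, priorPositive}.
     */
    static double[] prepareBatch(int batchNum, Iterator<StreamExample> reader, int batchSize) {
        int count = 0;
        int positives = 0;
        while (count < batchSize && reader.hasNext()) {
            StreamExample e = reader.next();
            e.batch = batchNum;
            e.weight = 1.0;
            if (e.label) {
                positives++;
            }
            count++;
        }
        if (count == 0) {
            return new double[] {0.0, 0.0}; // stream exhausted: no examples, no priors
        }
        return new double[] {(count - positives) / (double) count, positives / (double) count};
    }
}
```

Because the iterator keeps its position between calls, a second call with the next batch number continues where the previous batch ended.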


    prepareExtendedBatch

    private double[] prepareExtendedBatch(ExampleSet extendedBatch)
    Similar to prepareBatch, but for extended batches.

    Parameters:
    extendedBatch - containing the extended batch
    Returns:
    the class priors of the batch


    evaluatePredictions

    private double evaluatePredictions(ExampleSet exampleSet)
    Returns the accuracy of the predictions for the given example set.
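
As a point of reference, accuracy for a boolean label is the fraction of examples whose predicted label equals the true label. The sketch below uses plain arrays and hypothetical names; whether the actual method weights the examples is not stated on this page, so the unweighted form is an assumption.

```java
final class AccuracySketch {

    /** Fraction of positions at which the prediction equals the true label. */
    static double accuracy(boolean[] labels, boolean[] predictions) {
        if (labels.length != predictions.length) {
            throw new IllegalArgumentException("labels and predictions differ in length");
        }
        if (labels.length == 0) {
            throw new IllegalArgumentException("empty example set");
        }
        int correct = 0;
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] == predictions[i]) {
                correct++;
            }
        }
        return (double) correct / labels.length;
    }
}
```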


    trainAdditionalModel

    private boolean trainAdditionalModel(ExampleSet trainingSet,
                                         java.util.Vector<BayBoostBaseModelInfo> modelInfo)
                                  throws OperatorException
    Throws:
    OperatorException

    adjustBaseModelWeights

    private boolean adjustBaseModelWeights(ExampleSet exampleSet,
                                           java.util.Vector<BayBoostBaseModelInfo> modelInfo)
                                    throws OperatorException
    This helper method takes as input the training set and the set of models trained so far. It re-estimates the model weights one by one, which means that it changes the contents of the modelInfo container. Works with crisp base classifiers only.

    Parameters:
    exampleSet - the training set to be used to tune the weights
    modelInfo - the Vector of Models, each with its biasMatrix
    Returns:
    true iff the ExampleSet contains at least one example that is not yet explained deterministically (otherwise: nothing left to learn)
    Throws:
    OperatorException


    isModelUseful

    private boolean isModelUseful(ContingencyMatrix cm)
    Helper method to decide whether a model improves the training error enough to be considered.

    Parameters:
    cm - the contingency matrix
    Returns:
    true iff the advantage is high enough to consider the model to be useful
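
One possible reading of this check is sketched below. It assumes that "advantage" means the weighted accuracy minus 0.5, that is, the margin over random guessing when both classes are equally likely; this page does not define the term, so that definition is an assumption. The threshold value 0.001 and all names are hypothetical, and a 2x2 array stands in for ContingencyMatrix.

```java
final class ModelUsefulnessSketch {

    // Hypothetical threshold; the real MIN_ADVANTAGE value is not shown on this page.
    static final double MIN_ADVANTAGE = 0.001;

    /** cm[trueLabel][predictedLabel] holds weighted counts; index 0 = negative, 1 = positive. */
    static double advantage(double[][] cm) {
        double total = cm[0][0] + cm[0][1] + cm[1][0] + cm[1][1];
        if (total <= 0.0) {
            throw new IllegalArgumentException("contingency matrix is empty");
        }
        double accuracy = (cm[0][0] + cm[1][1]) / total;
        return accuracy - 0.5; // assumed definition: margin over chance level
    }

    /** A model is kept only if its advantage reaches the minimum. */
    static boolean isModelUseful(double[][] cm, double minAdvantage) {
        return advantage(cm) >= minAdvantage;
    }
}
```

A threshold of this kind stops the ensemble from growing with base models that are indistinguishable from guessing.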


    debugMessage

    private static void debugMessage(WeightedPerformanceMeasures wp)

    createOrReplacePredictedLabelFor

    private static void createOrReplacePredictedLabelFor(ExampleSet exampleSet,
                                                         Model model)
    Helper method replacing Model.createPredictedLabel(ExampleSet) in order to lower memory consumption.


    getParameterTypes

    public java.util.List<ParameterType> getParameterTypes()
    Adds the parameters "rescale label priors" and "weighted batch size".

    Overrides:
    getParameterTypes in class Operator



    Copyright © 2001-2006