edu.udo.cs.yale.operator.learner.meta
Class WeightedPerformanceMeasures

java.lang.Object
  extended by edu.udo.cs.yale.operator.learner.meta.WeightedPerformanceMeasures
Direct Known Subclasses:
AdaBoostPerformanceMeasures, SDReweightMeasures

public class WeightedPerformanceMeasures
extends java.lang.Object

This class computes and caches weighted performance measures as used by the BayesianBoosting algorithm and the similarly working ModelBasedSampling operator.

Version:
$Id: WeightedPerformanceMeasures.java,v 1.33 2006/04/12 21:48:58 martin_scholz Exp $
Author:
Martin Scholz

Field Summary
private  double[] labels
           
private  double[][] pred_label
           
private  double[] predictions
           
static double RULE_DOES_NOT_APPLY
          This constant is used to express that no examples have been observed.
private  int[][] unweighted_num_pred_label
           
 
Constructor Summary
WeightedPerformanceMeasures(ExampleSet exampleSet)
          Constructor.
 
Method Summary
 double[][] createLiftRatioMatrix()
           
 ContingencyMatrix getContingencyMatrix()
          Converts the deprecated representation into the new form.
 int[] getCoveredExamplesNumForPred(int prediction)
          Method to query for the unweighted absolute number of covered examples of each class, given a specific prediction.
 double[] getLabelPriors()
           
 double getLift(int label, int prediction)
          The lift of the rule specified by the nominal variable's indices.
 int getNumberOfLabels()
           
 int getNumberOfNonEmptyClasses()
           
 int getNumberOfPredictions()
           
 double[] getPnRatios(int prediction)
          The factor to be applied (pn-ratio) for each label if the model yields the specific prediction.
 double getProbability(int label, int prediction)
          Method to query for the probability of one of the prediction/label subsets.
 double getProbabilityLabel(int label)
          Method to query for the "prior" probability of one of the labels.
 double getProbabilityPrediction(int premise)
          Method to query for the "prior" probability of one of the predictions.
static double reweightExamples(ExampleSet exampleSet, ContingencyMatrix cm, boolean allowMarginalSkews)
          Helper method of the BayesianBoosting operator. This method reweights the example set with respect to the WeightedPerformanceMeasures object.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

RULE_DOES_NOT_APPLY

public static final double RULE_DOES_NOT_APPLY
This constant is used to express that no examples have been observed.

See Also:
Constant Field Values


predictions

private double[] predictions

labels

private double[] labels

pred_label

private double[][] pred_label

unweighted_num_pred_label

private int[][] unweighted_num_pred_label

Constructor Detail

WeightedPerformanceMeasures

public WeightedPerformanceMeasures(ExampleSet exampleSet)
                            throws OperatorException
Constructor. Reads an example set, calculates its weighted performance values and caches them internally for later requests.

Parameters:
exampleSet - the ExampleSet this object shall hold the performance measures for
Throws:
OperatorException
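
The caching step described for the constructor can be sketched as follows. This is a minimal illustration, not the YALE implementation: plain arrays stand in for the ExampleSet, the class and method names are hypothetical, and the [prediction][label] index order is only an assumption suggested by the field name pred_label.

```java
/** Hypothetical stand-in for illustration; not part of the YALE API. */
final class WeightedJointSketch {

    /**
     * Builds a matrix joint[prediction][label] that holds the weighted
     * fraction of examples falling into each prediction/label cell.
     * Labels and predictions are given as internal index numbers.
     */
    static double[][] jointProbabilities(int[] labels, int[] predictions, double[] weights,
                                         int numLabels, int numPredictions) {
        double[][] joint = new double[numPredictions][numLabels];
        double total = 0.0;
        // Accumulate the example weights per cell and overall.
        for (int i = 0; i < labels.length; i++) {
            joint[predictions[i]][labels[i]] += weights[i];
            total += weights[i];
        }
        // Normalize so that all cells sum to one (skipped for an empty or zero-weight set).
        if (total > 0.0) {
            for (double[] row : joint) {
                for (int l = 0; l < row.length; l++) {
                    row[l] /= total;
                }
            }
        }
        return joint;
    }
}
```

With such a matrix in place, the query methods below reduce to lookups and row or column sums.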

Method Detail

getCoveredExamplesNumForPred

public int[] getCoveredExamplesNumForPred(int prediction)
Method to query for the unweighted absolute number of covered examples of each class, given a specific prediction.

Parameters:
prediction - the value predicted by the model (internal index number)
Returns:
an int[] array with the number of examples of class i (internal index number) stored at index i.


getNumberOfLabels

public int getNumberOfLabels()
Returns:
the number of classes, i.e. the number of distinct values of the label attribute of this object's example set

getNumberOfPredictions

public int getNumberOfPredictions()
Returns:
number of predictions or nominal classes predicted by the embedded learner. Not necessarily the same as the number of class labels.

getProbability

public double getProbability(int label,
                             int prediction)
Method to query for the probability of one of the prediction/label subsets.

Parameters:
label - the (correct) class label of the example as it comes from the internal index
prediction - the boolean value predicted by the model (premise) (internal index number)
Returns:
the joint probability of label and prediction


getProbabilityLabel

public double getProbabilityLabel(int label)
Method to query for the "prior" probability of one of the labels.

Parameters:
label - the nominal class label
Returns:
the probability of seeing an example with this label


getProbabilityPrediction

public double getProbabilityPrediction(int premise)
Method to query for the "prior" probability of one of the predictions.

Parameters:
premise - the prediction of a model
Returns:
the probability of drawing an example so that the model makes this prediction
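
The two "prior" probabilities are marginals of the joint prediction/label distribution. The sketch below shows that relationship on a plain matrix; the class name, the method names and the [prediction][label] layout are assumptions made for illustration, not the class's actual internals.

```java
/** Hypothetical stand-in for illustration; not part of the YALE API. */
final class MarginalsSketch {

    /** P(label): sums the joint probabilities over all predictions. */
    static double[] labelPriors(double[][] joint) {
        int numLabels = joint.length == 0 ? 0 : joint[0].length;
        double[] priors = new double[numLabels];
        for (double[] row : joint) {
            for (int l = 0; l < numLabels; l++) {
                priors[l] += row[l];
            }
        }
        return priors;
    }

    /** P(prediction): sums the joint probabilities over all labels. */
    static double[] predictionPriors(double[][] joint) {
        double[] priors = new double[joint.length];
        for (int p = 0; p < joint.length; p++) {
            for (double v : joint[p]) {
                priors[p] += v;
            }
        }
        return priors;
    }
}
```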


getLift

public double getLift(int label,
                      int prediction)
The lift of the rule specified by the nominal variable's indices. RULE_DOES_NOT_APPLY is returned to indicate that no such example has ever been observed; Double.POSITIVE_INFINITY is returned if the class membership can be concluded deterministically from the prediction.

Important: In the multi-class case, some of the classes might not be observed at all when a specific rule applies, even though the rule does not necessarily have a deterministic part. In this case the remaining classes are considered to be the complete set of classes when calculating the default values and lifts. This does not affect the prediction of the most likely class label, because the classes not observed have a probability of one and the other estimates increase proportionally. However, to calculate probabilities it is necessary to normalize the estimates in the class BayBoostModel.

Parameters:
label - the true label
prediction - the predicted label
Returns:
the LIFT, which is a value >= 0, positive infinity if all examples with this prediction belong to that class (deterministic rule), or RULE_DOES_NOT_APPLY if no prediction can be made.
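
As a rough illustration of the return values listed above, the sketch below uses the standard definition lift = P(label, prediction) / (P(label) * P(prediction)). It is an assumption that this class uses exactly this formula, and the multi-class adjustment described above is not modeled. The sentinel value -1.0 is hypothetical, since this page does not state the value of RULE_DOES_NOT_APPLY.

```java
/** Hypothetical stand-in for illustration; not part of the YALE API. */
final class LiftSketch {

    /** Hypothetical sentinel; the real constant's value is not given in this page. */
    static final double RULE_DOES_NOT_APPLY = -1.0;

    /** Lift of one prediction/label cell of a joint[prediction][label] matrix. */
    static double lift(double[][] joint, int label, int prediction) {
        double pPred = 0.0;
        for (double v : joint[prediction]) {
            pPred += v;
        }
        double pLabel = 0.0;
        for (double[] row : joint) {
            pLabel += row[label];
        }
        // Nothing observed for this prediction or this label: no statement possible.
        if (pPred <= 0.0 || pLabel <= 0.0) {
            return RULE_DOES_NOT_APPLY;
        }
        double pJoint = joint[prediction][label];
        // All weight covered by this prediction belongs to this label: deterministic rule.
        if (pJoint >= pPred) {
            return Double.POSITIVE_INFINITY;
        }
        return pJoint / (pLabel * pPred);
    }
}
```

Callers would compare against the sentinel before using the result as a factor, for example `if (lift != LiftSketch.RULE_DOES_NOT_APPLY) { ... }`.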


getPnRatios

public double[] getPnRatios(int prediction)
The factor to be applied (pn-ratio) for each label if the model yields the specific prediction.

Parameters:
prediction - the predicted class
Returns:
a double[] array containing one factor for each class. The result should either consist of well-defined lifts >= 0, or all fields should uniformly contain the constant RULE_DOES_NOT_APPLY.


createLiftRatioMatrix

public double[][] createLiftRatioMatrix()
Returns:
a matrix with one pn-factor per prediction/label combination, or the priors of predictions for the case of soft base classifiers.

getLabelPriors

public double[] getLabelPriors()
Returns:
a double[] with the prior probabilities of all class labels.

getNumberOfNonEmptyClasses

public int getNumberOfNonEmptyClasses()
Returns:
the number of classes with strictly positive weight
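
"Strictly positive weight" can be read as a positive prior for the class. A minimal sketch of that count, assuming an array of label priors as returned by getLabelPriors() (the class and method names here are hypothetical):

```java
/** Hypothetical stand-in for illustration; not part of the YALE API. */
final class NonEmptyClassesSketch {

    /** Counts the classes whose prior probability is strictly greater than zero. */
    static int countNonEmpty(double[] labelPriors) {
        int count = 0;
        for (double prior : labelPriors) {
            if (prior > 0.0) {
                count++;
            }
        }
        return count;
    }
}
```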

getContingencyMatrix

public ContingencyMatrix getContingencyMatrix()
Converts the deprecated representation into the new form.


reweightExamples

public static double reweightExamples(ExampleSet exampleSet,
                                      ContingencyMatrix cm,
                                      boolean allowMarginalSkews)
                               throws OperatorException
Helper method of the BayesianBoosting operator. This method reweights the example set with respect to the WeightedPerformanceMeasures object. Please note that the weights will not be reset at any time, because they continuously change from one iteration to the next. This method does not change the priors of the classes.

Parameters:
exampleSet - ExampleSet to be reweighted
cm - the ContingencyMatrix as e.g. returned by WeightedPerformanceMeasures
allowMarginalSkews - indicates whether the weights of covered and uncovered subsets are allowed to change.
Returns:
the total weight
Throws:
OperatorException
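
One plausible reading of the reweighting step is that each example's weight is divided by the lift of its prediction/label cell, so that label and prediction become independent under the new weights. The sketch below illustrates only that idea and is not the YALE implementation: plain arrays replace the ExampleSet and the ContingencyMatrix, all names are hypothetical, and the allowMarginalSkews option is not modeled.

```java
/** Hypothetical stand-in for illustration; not part of the YALE API. */
final class ReweightSketch {

    /**
     * Rescales weights in place by P(label) * P(prediction) / P(label, prediction),
     * i.e. by the inverse lift of each example's cell, and returns the total weight.
     * The matrix is indexed as joint[prediction][label].
     */
    static double reweight(int[] labels, int[] predictions, double[] weights, double[][] joint) {
        double[] pPred = new double[joint.length];
        double[] pLabel = new double[joint.length == 0 ? 0 : joint[0].length];
        for (int p = 0; p < joint.length; p++) {
            for (int l = 0; l < joint[p].length; l++) {
                pPred[p] += joint[p][l];
                pLabel[l] += joint[p][l];
            }
        }
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            double pJoint = joint[predictions[i]][labels[i]];
            // Cells without weight carry no information; leave such examples unchanged.
            if (pJoint > 0.0) {
                weights[i] *= pLabel[labels[i]] * pPred[predictions[i]] / pJoint;
            }
            total += weights[i];
        }
        return total;
    }
}
```

In this sketch the class priors and the total weight stay the same only when every prediction/label cell carries some weight; with empty cells the total can shrink, which is one reason the real method may need additional handling.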



Copyright © 2001-2006