Even assuming that
can be solved in unit time, a
literal interpretation of Equation 4 says we'll need to
solve this optimization problem for an exponential number of cost
vectors (or even more if the probability distributions
are
continuous). kearns99b kearns99b showed that values of partially
observable Markov decision processes could be estimated accurately by
sampling trajectories instead of exactly computing sums.
littman00 littman00 did the same for stochastic satisfiability
expressions. Applying this idea to Equation 4 leads
to the following algorithm.
A remaining challenge in evaluating Equation 6 is
computing the real-valued bid that maximizes the value. Note
that we want to buy item
precisely at those closing prices for
which the value of having the item (minus its cost) exceeds the value
of not having the item; this maximizes profit. Thus, to make a
positive profit, we are willing to pay up to, but not more than, the
difference in value of having the item and not having the item.
Formally,
let be the vector
of current holdings and
be the holdings modified to reflect
winning item
. Let
, the optimal
set of purchases assuming item
was won, and
the optimal set of purchases assuming otherwise
(except in cases of ambiguity, we write simply
and
for
and
respectively). We want to select
to
achieve the equivalence