The heuristics computed on the $LUG$ can capture measures similar to the $MG$ heuristics, but there exists a new opportunity to make use of labels to improve heuristic computation efficiency. A single planning graph is sufficient if there is no state aggregation being measured, so we do not mention such measures for the $LUG$.
Positive Interaction Aggregation: Unlike the $MG$ heuristics, we do not compute positive-interaction-based relaxed plans on the $LUG$. The $MG$ approach to measuring positive interaction across each state in a belief state is to compute multiple relaxed plans and take their maximum value. To get the same measure on the $LUG$ we would still need to extract multiple relaxed plans, which is exactly the situation we are trying to avoid by using the $LUG$. While the graph construction overhead may be lowered by using the $LUG$, the heuristic computation could take too long. Hence, we do not compute relaxed plans on the $LUG$ to measure positive interaction alone, but we do compute relaxed plans that measure overlap (which measures positive interaction).
Independence Aggregation: As with positive interaction aggregation, we would need a relaxed plan for every state in the projected belief state to find the summation of the costs. Hence, we do not compute relaxed plans that assume independence.
State Overlap Aggregation: A relaxed plan extracted from the $LUG$ to get the $h^{LUG}_{RP}$ heuristic resembles the unioned relaxed plan in the $h^{MG}_{RP}$ heuristic. Recall that the $h^{MG}_{RP}$ heuristic extracts a relaxed plan from each of the multiple planning graphs (one for each possible world) and unions the set of actions chosen at each level of the relaxed plans. The $LUG$ relaxed plan heuristic is similar in that it counts actions that have positive interaction in multiple worlds only once and accounts for independent actions that are used in subsets of the possible worlds. The advantage of the $LUG$ is that we find these actions with a single pass on one planning graph.
We are trading the cost of computing multiple relaxed plans for the cost of manipulating $LUG$ labels to determine which lines of causal support are used in which worlds. In the relaxed plan we want to support the goal in every state in $BS_P$, but in doing so we need to track which states in $BS_P$ use which paths in the planning graph. A subgoal may have several different (and possibly overlapping) paths from the worlds in $BS_P$.
A $LUG$ relaxed plan is a set of layers $\{\mathcal{A}^{RP}_0, \mathcal{E}^{RP}_0, \mathcal{P}^{RP}_1, \ldots, \mathcal{A}^{RP}_{b-1}, \mathcal{E}^{RP}_{b-1}, \mathcal{P}^{RP}_b\}$, where each $\mathcal{A}^{RP}_r$ is a set of actions, each $\mathcal{E}^{RP}_r$ is a set of effects, and each $\mathcal{P}^{RP}_r$ is a set of clauses. The elements of the layers are labelled to indicate the worlds of $BS_P$ where they are chosen for support. The relaxed plan is extracted from level $b$ (i.e., the first level where $BS_G$ is reachable, as also described in Appendix A).
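As a rough illustration, the layered, labelled structure can be represented as follows. This is a minimal sketch with names of our own choosing (not the paper's implementation); labels are sets of world indices of the source belief state.

```python
from dataclasses import dataclass, field

@dataclass
class RPLayer:
    # Each mapping associates an element with its label: the set of
    # worlds in which that element is chosen for support.
    actions: dict = field(default_factory=dict)
    effects: dict = field(default_factory=dict)
    clauses: dict = field(default_factory=dict)

def empty_relaxed_plan(b):
    """A relaxed plan extracted from level b keeps b layers; the goal
    clauses go in the clause mapping of the top layer."""
    return [RPLayer() for _ in range(b)]
```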
Please note that we are extracting the relaxed plan for $BS_G$ in terms of clauses, not literals, which differs from the $SG$ and $MG$ versions of relaxed plans. Previously we found the constituent of $BS_G$ that was first reached on a planning graph, whereas now we do not commit to any one constituent. Our rationale is that we were possibly using different constituents in each of the multiple graphs, and in this condensed version of the multiple graphs we still want to be able to support different constituents of $BS_G$ in different worlds. We could also use the constituent representation of $BS_G$ in defining the layers of the relaxed plan, but we choose the clausal representation of $BS_G$ instead because we know that we have to support each clause. With constituents, however, we know we only need to support one (but we don't need to know which one).
The relaxed plan, shown in bold in Figure 7, for $BS_i$ to reach $BS_G$ in CBTC is listed as follows: $\mathcal{A}^{RP}_0 = \{\text{Flush}, \text{inP1}_p, \text{inP2}_p\}$, $\mathcal{P}^{RP}_1 = \{\neg\text{clog}, \text{inP1}, \text{inP2}\}$, $\mathcal{A}^{RP}_1 = \{\text{DunkP1}, \text{DunkP2}, \neg\text{clog}_p\}$, and $\mathcal{P}^{RP}_2 = \{\neg\text{arm}, \neg\text{clog}\}$, with $\mathcal{E}^{RP}_0$ and $\mathcal{E}^{RP}_1$ containing the chosen effects of these actions, each element labelled with the worlds in which it is chosen.
We start by forming $\mathcal{P}^{RP}_2$ with the clauses in $\kappa(BS_G)$, namely $\neg$arm and $\neg$clog; we label the clauses with $BS_i$ because they need to be supported by all states in our belief state. Next, we support each clause in $\mathcal{P}^{RP}_2$ with the relevant effects from level one to form $\mathcal{E}^{RP}_1$. For $\neg$clog we use persistence because it supports $\neg$clog in all worlds described by $BS_i$ (this is an example of positive interaction of worlds). For $\neg$arm the relevant effects are the respective conditional effects from each Dunk action. We choose both effects to support $\neg$arm because we need to support $\neg$arm in all worlds of $BS_i$, and each effect gives support in only one world (this is an example of independence of worlds). We then insert the actions associated with each chosen effect into $\mathcal{A}^{RP}_1$ with the appropriate label indicating the worlds where each was needed, which in general is fewer worlds than where it is reachable (i.e., it is always the case that $\ell^{RP}_r(a) \subseteq \ell_r(a)$). Next we form $\mathcal{P}^{RP}_1$ with the execution preconditions of actions in $\mathcal{A}^{RP}_1$ and antecedents of effects in $\mathcal{E}^{RP}_1$, which are $\neg$clog, inP1, and inP2, labelled with all worlds where an action or effect needed them. In the same fashion as level two, we support the literals at level one, using persistence for inP1 and inP2, and Flush for $\neg$clog. We stop here because we have supported all clauses at level one.
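The bookkeeping in this walkthrough can be sketched concretely. Here world 1 stands for the state with the bomb in package 1 and world 2 for package 2; all identifiers are illustrative, not taken from an actual implementation.

```python
W = frozenset({1, 2})  # the two worlds of the initial belief state

# Level-2 goal clauses, each labelled with all worlds.
P2 = {"~arm": W, "~clog": W}

# Level-1 chosen actions: persistence covers ~clog in both worlds
# (positive interaction); each Dunk effect covers ~arm in only one
# world (independence), so both Dunk actions are needed.
A1 = {"persist(~clog)": W, "DunkP1": frozenset({1}), "DunkP2": frozenset({2})}

# Level-1 subgoals from execution preconditions and antecedents.
P1 = {"~clog": W, "inP1": frozenset({1}), "inP2": frozenset({2})}

# Level-0 support: Flush for ~clog, persistence for inP1/inP2.
A0 = {"Flush": W, "persist(inP1)": frozenset({1}), "persist(inP2)": frozenset({2})}

def count_real_actions(layer):
    """Count non-persistence actions in a labelled action layer."""
    return sum(1 for a in layer if not a.startswith("persist"))

h = count_real_actions(A0) + count_real_actions(A1)
print(h)  # 3: Flush, DunkP1, DunkP2
```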
For the general case, extraction starts at the level $b$ where $BS_G$ is first reachable from $BS_P$. The first relaxed plan layers we construct are $\mathcal{A}^{RP}_{b-1}$, $\mathcal{E}^{RP}_{b-1}$, and $\mathcal{P}^{RP}_b$, where $\mathcal{P}^{RP}_b$ contains all clauses $C \in \kappa(BS_G)$, labelled as $\ell^{RP}_b(C) = BS_P$. For each level $r$, $b \geq r \geq 1$, we support each clause in $\mathcal{P}^{RP}_r$ by choosing relevant effects from level $r-1$ of the $LUG$ to form $\mathcal{E}^{RP}_{r-1}$. An effect $\varphi$ is relevant if it is reachable in some of the worlds where we need to support the clause $C$ (i.e., $\ell_{r-1}(\varphi) \cap \ell^{RP}_r(C) \neq \emptyset$) and its consequent gives a literal $l \in C$. For each clause, we have to choose enough supporting effects so that the chosen effects' worlds are a superset of the worlds where we need to support the clause, formally: $\ell^{RP}_r(C) \subseteq \bigcup_{\varphi \in \mathcal{E}^{RP}_{r-1}(C)} \ell_{r-1}(\varphi)$, where $\mathcal{E}^{RP}_{r-1}(C)$ denotes the effects chosen to support $C$.
We think of supporting a clause in a set of worlds as a set cover problem where effects cover subsets of worlds. Our algorithm to cover the worlds of a clause with worlds of effects is a variant of the well-known greedy algorithm for set cover [12]. We first choose all relevant persistence effects that can cover worlds, then repeatedly choose the action effect that covers the most new worlds. Each effect we choose for support is added to $\mathcal{E}^{RP}_{r-1}$ and labelled with the new worlds it covered for the clause. Once all clauses in $\mathcal{P}^{RP}_r$ are covered, we form the action layer $\mathcal{A}^{RP}_{r-1}$ as the set of all actions that have an effect in $\mathcal{E}^{RP}_{r-1}$. The actions in $\mathcal{A}^{RP}_{r-1}$ are labelled to indicate all worlds where any of their effects were labelled in $\mathcal{E}^{RP}_{r-1}$.
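The greedy cover just described might look like the following sketch. This is our own simplification: `persistence` and `effects` are assumed to map candidate effect identifiers to the sets of worlds in which each is reachable, which is not necessarily how the labels are stored in practice.

```python
def cover_clause(clause_worlds, persistence, effects):
    """Choose effects whose reachable worlds cover clause_worlds.

    Returns {effect id: newly covered worlds} -- a variant of the
    greedy set cover algorithm: persistence first, then the action
    effect covering the most new worlds.
    """
    chosen, uncovered = {}, set(clause_worlds)
    # 1) Take every relevant persistence effect that covers new worlds.
    for eff, worlds in persistence.items():
        new = uncovered & worlds
        if new:
            chosen[eff] = new
            uncovered -= new
    # 2) Greedily take the action effect covering the most new worlds.
    while uncovered:
        eff, worlds = max(effects.items(), key=lambda kv: len(uncovered & kv[1]))
        new = uncovered & worlds
        if not new:
            raise ValueError("clause not supportable in all required worlds")
        chosen[eff] = new
        uncovered -= new
    return chosen
```

For instance, covering worlds {1, 2} with only a persistence effect reachable in both worlds selects just that effect, while covering them with two one-world action effects selects both.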
We obtain the next subgoal layer, $\mathcal{P}^{RP}_{r-1}$, by adding the literals from the execution preconditions of actions in $\mathcal{A}^{RP}_{r-1}$ and the antecedents of effects in $\mathcal{E}^{RP}_{r-1}$. Each literal $l \in \mathcal{P}^{RP}_{r-1}$ is labelled to indicate all worlds where any action or effect requires $l$. We support the literals in $\mathcal{P}^{RP}_{r-1}$ in the same fashion as the clauses in $\mathcal{P}^{RP}_r$. We continue to support literals with effects, insert actions, and insert action and effect preconditions until we have supported all literals in $\mathcal{P}^{RP}_1$.
Once we have a relaxed plan, the relaxed plan heuristic, $h^{LUG}_{RP}(BS_G)$, is the summation of the number of actions in each action layer, formally: $h^{LUG}_{RP}(BS_G) = \sum_{r=0}^{b-1} |\mathcal{A}^{RP}_r|$, where persistence actions are not counted in $|\mathcal{A}^{RP}_r|$.
Thus, in our CBTC example we have $h^{LUG}_{RP}(BS_G) = 3$, counting Flush, DunkP1, and DunkP2. Notice that if we construct the $LUG$ without mutexes for CBTC, we reach the goal after two layers. If we had included mutexes in the $LUG$, then it would reach the goal after three layers. The way we use mutexes will not change our relaxed plan because we do not use mutexes to influence relaxed plan extraction. Mutexes only help to identify when the belief state $BS_G$ is not reachable from $BS_P$.