The Tileworld domain involves a grid with tiles and holes, and the goal is to fill each hole with a tile. This goal can be achieved with a fill operator, which has two preconditions: the agent must be at the hole, and it must be holding a tile. In our encoding, an agent can hold up to four tiles at a time. The go operator is used to achieve the (sub)goal of being at a hole, while the pickup operator is used to achieve the (sub)goal of holding a tile. In the normal way, go has a precondition of being at some location, namely whatever location the agent will move from. Pickup has a precondition of being at the location of some tile. The problems in the Tileworld problem set differ from one another in the number of holes that the agent must fill: each problem adds another hole.
Figures gives the log-log plot for the various strategies
on the Tileworld problems, when the preconditions were entered in the
default order. Note that LCFR (S+OC+UC) is the strategy
highlighted. Three other strategies were almost indistinguishable
from LCFR (S+OC+UC), namely, LCFR (S+OC+.1UC+F), DUnf-Gen
(S+OC+UC) and DUnf-Gen(S+OC+.1UC+F). All the other strategies
performed worse. This can more easily be seen in Figure
,
which gives the aggregate performance for the leading strategies:
those that were able to solve all seven Tileworld problems. In fact,
these leading
strategies were able to solve the seven Tileworld problems
without generating more than 1800 nodes for any problem. In contrast,
the remaining strategies failed on at least one, and up to four, of
the seven problems, given the limit of 100,000 nodes generated.
What was originally surprising to us is that on the Tileworld problems, delaying separable threats actually seems to hurt performance. The strategies that did best were those like LCFR and DUnf-Gen that do not delay separable threats. LCFR-DSep, ZLIFO, DSep-LC, and DSep all generated larger search spaces, in contrast to what we would have predicted given the experiments on the basic problem set.
To understand this result, we looked in detail at the planning trace
for these problems. What that revealed is that for the Tileworld
domain, the early resolution of separable threats has an important
advantage: it imposes what turns out to be the correct temporal
ordering between the steps of going to up a tile (to pick it up), and
carrying it to a hole. Virtually all the strategies create subplans
like the one shown in Figure . The goals involve filling
holes, so the planners insert steps to go to and pick up a tile, and
to go to the hole. At this point, there are two separable threats:
(1) the effect of going to the hole, (¬at</I>(X)), threatens the link
between going to the tile and picking it up (at(Z)), and (2) the
effect of going to the tile, (¬at</I>(W)), threatens the link between
going to the hole and filling it (at(Y)). Both threats are
separable, because X and W will be unbound; the planner does not
yet know where it will be traveling from. But there is only one valid
temporal ordering that will resolve these threats: going to the tile
must precede picking up the tile, which in turn must precede going to
the hole. Once this temporal ordering is determined, further planning
goes smoothly.
In contrast, if this ordering decision is not made, the planner can often ``get lost'', attempting to find plans in which it goes from some location to the hole and then from the hole to the tile. There are many ways to attempt this, because there are many different tiles to select, and many different locations to move among. The planner may try many of these alternatives before determining that there is a fundamental inconsistency in these plans, and that they are destined to fail. The larger the number of holes to be filled, the worse the situation becomes.
Sometimes the planner may make the right decision about temporal
ordering even if it has deferred separable threats. When faced with
the partial plan in Figure , if the planner does not select
a threat, it will select from among several open conditions. It can
attempt to establish the precondition of going to the hole (at(X))
by reusing the effect of going to the tile (at(Z)), or it can do the
reverse, and attempt to establish the precondition of going to the
tile (at(W)) by reusing the effect of going to the hole (at(X)).
Of course, the first solution is the right one, and includes the
critical temporal ordering constraint, while the second will
eventually fail.
The order in which the open conditions are selected will determine
which of these two choices the planner makes. When preconditions are
entered in the default order, planners that delay separable threats
end up making the latter, problematic choice. In contrast, when the
preconditions are entered in the reverse order, the planners make what
turns out to be the correct choice. Thus, for the experiments in
which we reversed precondition insertion, we see a different pattern
of performance, as shown in Figures
-
.
When the preconditions are entered in the reverse order, a larger number of strategies perform well, solving all the problems. In particular, with S+OC+.1UC+F node-selection, the performance of LCFR, DUnf-Gen, ZLIFO, and LCFR-DSep is virtually indistinguishable. It is important to note that the leading strategies that do not delay separable threats--LCFR and DUnf-Gen--are not affected much by the reversal of precondition insertion for the Tileworld problems; in fact, LCFR's performance is identical in both cases. In contrast, the strategies that use separable-threat delay--LCFR-DSep, ZLIFO, and DSep-LC--all perform much better when we reverse precondition insertion. This is explained by our analysis above.
In sum, what is most important for the Tileworld domain is for the planner to recognize, as early as possible, that there are certain required temporal orderings between some of the steps in any successful plan. Every successful plan will involve going to a tile before going to a hole, although there is flexibility in the order in which multiple holes are visited, and in the interleaving of picking up tiles and dropping them in holes. For the strategies we studied, there were two different methods that led to this temporal constraint being added to the plan. It was added when the planner selected a separable threat to resolve, and it was added when it selected one particular precondition to resolve before another.