Our planning formulation uses regression search to find strong conformant plans and progression search to find strong conformant and conditional plans. A strong plan guarantees that after a finite number of actions executed from any of the many possible initial states, all resulting states are goal states. Conformant plans are a special case where the plan has no conditional plan branches, as in classical planning. Conditional plans are a more general case where plans are structured as a graph because they include conditional actions (i.e. the actions have causative and observational effects). In this presentation, we restrict conditional plans to DAGs, but there is no conceptual reason why they cannot be general graphs. Our plan quality metric is the maximum plan path length.
We formulate search in the space of belief states, a technique
described by [6]. The planning problem
is
defined as the tuple
, where
is a
domain description,
is the initial belief state, and
is the goal belief state (consisting of all states satisfying the
goal). The domain
is a tuple
, where
is a set of fluents and
is a set of actions.
Logical Formula Representation: We make extensive use of
logical formulas over
to represent belief states, actions, and
labels, so we first explain a few conventions. We refer to
every fluent in
as either a positive literal or a negative
literal, either of which is denoted by
. When discussing the
literal
, the opposite polarity literal is denoted
. Thus
if
=
at(location1), then
= at(location1). We
reserve the symbols
and
to denote logical false and
true, respectively. Throughout the paper we define the conjunction
of an empty set equivalent to
, and the disjunction of an
empty set as
.
Logical formulas are propositional sentences comprised of literals,
disjunction, conjunction, and negation. We refer to the set of
models of a formula
as
. We consider the
disjunctive normal form of a logical formula
,
,
and the conjunctive normal form of
,
. The DNF is seen
as a disjunction of ``constituents''
each of which is a
conjunction of literals. Alternatively the CNF is seen as a
conjunction of ``clauses''
each of which is a disjunction of
literals.1 We find it useful to think of DNF and CNF represented as
sets - a disjunctive set of constituents or a conjunctive set of
clauses. We also refer to the complete representation
of a
formula
as a DNF where every constituent - or in this case
state
- is a model of
.
Belief State Representation: A world state,
, is represented
as a complete interpretation over fluents. We also refer to states
as possible worlds. A belief state
is a set of states and is
symbolically represented as a propositional formula over
. A
state
is in the set of states represented by a belief state
if
, or equivalently
.
For pedagogical purposes, we use the bomb and toilet with clogging and sensing problem, BTCS, as a running example for this paper.2 BTCS is a problem that includes two packages, one of which contains a bomb, and there is also a toilet in which we can dunk packages to defuse potential bombs. The goal is to disarm the bomb and the only allowable actions are dunking a package in the toilet (DunkP1, DunkP2), flushing the toilet after it becomes clogged from dunking (Flush), and using a metal-detector to sense if a package contains the bomb (DetectMetal). The fluents encoding the problem denote that the bomb is armed (arm) or not, the bomb is in a package (inP1, inP2) or not, and that the toilet is clogged (clog) or not. We also consider a conformant variation on BTCS, called BTC, where there is no DetectMetal action.
The belief state representation of the BTCS initial condition, in clausal representation is:
= arm
clog
(inP1
inP2)
(
inP1
inP2),
and in constituent representation is:
= (arm
clog
inP1
inP2)
(arm
clog
inP1
inP2).
The goal of BTCS has the clausal and constituent representation:
=
=
arm.
However, the goal has the complete representation:
The last four states (disjuncts) in the complete representation are unreachable, but consistent with the goal description.
Action Representation: We represent actions as having both
causative and observational effects. All actions
are described
by a tuple
where
is an execution precondition,
is a set of
causative effects, and
is a set of observations. The
execution precondition,
, is a conjunction of literals
that must hold for the action to be executable. If an action is
executable, we apply the set of causative effects to find successor
states and then apply the observations to partition the successor
states into observational classes.
Each causative effect
is a conditional
effect of the form
, where
the antecedent
and consequent
are
both a conjunction of literals. We handle disjunction in
or a
by replicating the respective action or
effect with different conditions, so with out loss of generality we
assume conjunctive preconditions. However, we cannot split
disjunction in the effects. Disjunction in an effect amounts to
representing a set of non-deterministic outcomes. Hence we do not
allow disjunction in effects thereby restricting to deterministic
effects. By convention
is an unconditional effect,
which is equivalent to a conditional effect where
.
The only way to obtain observations is to execute an action with
observations. Each observation formula
is a
possible sensor reading. For example, an action
that observes
the truth values of two fluents
and
defines
.
This differs slightly from the conventional description of
observations in the conditional planning literature. Some works
(e.g., [28]) describe an observation as a list of
observable formulas, then define possible sensor readings as all
boolean combinations of the formulas. We directly define the
possible sensor readings, as illustrated by our example. We note
that our convention is helpful in problems where some boolean
combinations of observable formulas will never be sensor readings.
The causative and sensory actions for the example BTCS problem are:
DunkP1:
=
clog,
clog,
inP1
arm
,
DunkP2:
clog,
clog,
inP2
arm
,
Flush:
,
clog
, and
DetectMetal:
inP1,
inP1
.