The important property of separable DCPs is that their solutions can be constructed from solutions to their components:
Note that the parallel and serial cases are different. One would
expect the parallel case to be easier to solve because the policy can
perform actions on both state components simultaneously. In fact it
is more difficult because one is required to perform actions
on both simultaneously and this leaves the agent no way of preserving
one solved subproblem while solving another. Consider a ``flip-flop''
environment where flip(x)=1-x. F has
the property that every state is accessible from every other state.
also has this property.
, however, does
not.
has only one action, which flips both state
components at once. Thus only two states are accessible from any
given state in
: the state itself and its flip. As with
the king, the problem is fixed if we add the identity action to F.
Then it is possible to leave one component of the product intact,
while changing the other. The identity action, while sufficient, is
not necessary. A weaker, but still unnecessary, condition is that F
have some action that always maps goal states to goal states.