PLTLMIN

Unfortunately, the MDPs produced by PLTLSIM are far from minimal. Although they could be postprocessed for minimisation before invoking the MDP solution method, the above expansion may still constitute a serious bottleneck. Therefore, Bacchus et al. [2] consider a more complex two-phase translation, which we call PLTLMIN, capable of producing an MDP also satisfying property (3). Here, a preprocessing phase iterates over all states in $S$, and computes, for each state $s$, a set $l(s)$ of subformulae, where the function $l$ is the solution of the fixpoint equation $l(s) = F \cup \{\mbox{Reg}(\psi',s') \mid \psi' \in l(s'), \mbox{$s'$\ is a successor of $s$}\}$. Only subformulae in $\overline{l(s)}$ will be candidates for inclusion in the sets labelling the respective e-states labelled with $s$. That is, the subsequent expansion phase will be as above, but taking $\Psi_0 \subseteq \overline{l(s_0)}$ and $\psi' \subseteq \overline{l(s')}$ instead of $\Psi_0 \subseteq \overline{\mbox{Sub}(F)}$ and $\psi' \subseteq \overline{\mbox{Sub}(F)}$. As the subformulae in $l(s)$ are exactly those that are relevant to the way feasible execution sequences starting from e-states labelled with $s$ are rewarded, this leads the expansion phase to produce a minimal equivalent MDP. Figure 5 shows the equivalent MDP produced by PLTLMIN for the NMRDP example in Figure 3, together with the function $l$ from which the labels are built. Observe how this MDP is smaller than the PLTLSIM MDP: once we reach the state on the left-hand side in which $p$ is true and $q$ is false, there is no point in tracking the values of subformulae, because $q$ cannot become true and so the reward formula cannot either. This is reflected by the fact that $l(\{p\})$ only contains the reward formula. In the worst case, computing $l$ requires a space, and a number of iterations through $S$, exponential in $\vert\vert F\vert\vert$. Hence the question arises of whether the gain during the expansion phase is worth the extra complexity of the preprocessing phase. This is one of the questions our experimental analysis in Section 5 will try to answer.
Figure 4: Equivalent MDP Produced by PLTLSIM
$\textstyle \parbox{0.55\textwidth}{\includegraphics[height=0.4\textheight]{figures/pqpltlsim}}$$\textstyle \parbox{0.4\textwidth}{\footnotesize \par The following subformulae ... ...dash \circleddash p$\\ $f_{10}: \neg (q \wedge \circleddash \circleddash p)$ }$
Figure 5: Equivalent MDP Produced by PLTLMIN
$\textstyle \parbox{0.55\textwidth}{\includegraphics[height=0.4\textheight]{figures/pqpltlmin}}$$\textstyle \parbox{0.4\textwidth}{\footnotesize \par The function $l$\ is given... ...circleddash \circleddash p$)\\ $f_5: \neg\circleddash p$\\ $f_6: \neg p$\\ }$
Sylvie Thiebaux 2006-01-20