Making smaller information-feedback plans

The primary use of an I-map is to simplify the description of a plan. In Section 11.1.3, a plan was defined as a function on the history I-space, $ {\cal I}_{hist}$. Suppose that an I-map, $ {\kappa }$, is introduced that maps from $ {\cal I}_{hist}$ to $ {\cal I}_{der}$. A feedback plan on $ {\cal I}_{der}$ is defined as $ \pi : {\cal I}_{der}\rightarrow U$. To execute a plan defined on $ {\cal I}_{der}$, the derived I-state is computed at each stage $ k$ by applying $ {\kappa }$ to $ {\eta}_k$ to obtain $ {\kappa}({\eta}_k) \in {\cal I}_{der}$. The action selected by $ \pi $ is $ \pi ({\kappa}({\eta}_k)) \in U$.
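
The execution rule above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a history I-state is modeled as a tuple of observations (the initial condition and action history are omitted), and the names `execute_derived_plan`, `kappa_last`, and `pi_last` are hypothetical, not taken from the text.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

# Assumed encoding: a history I-state eta_k is the tuple of observations so far.
History = Tuple[Hashable, ...]


def execute_derived_plan(
    kappa: Callable[[History], Hashable],
    pi: Callable[[Hashable], str],
    observations: Sequence[Hashable],
) -> List[str]:
    """Return the action pi(kappa(eta_k)) chosen at each stage k."""
    actions = []
    eta: History = ()
    for y in observations:
        eta = eta + (y,)             # eta_k: history up to stage k
        derived = kappa(eta)         # kappa(eta_k), an element of I_der
        actions.append(pi(derived))  # pi(kappa(eta_k)), an element of U
    return actions


# Hypothetical I-map: keep only the most recent observation.
def kappa_last(eta: History) -> Hashable:
    return eta[-1]


# Hypothetical plan on the derived I-space.
def pi_last(obs: Hashable) -> str:
    return "stop" if obs == "wall" else "forward"


actions = execute_derived_plan(kappa_last, pi_last, ["free", "free", "wall"])
```

The plan never sees the full history; it only receives whatever the I-map retains.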

To understand the effect of using $ {\cal I}_{der}$ instead of $ {\cal I}_{hist}$ as the domain of $ \pi $, consider the set of possible plans that can be represented over $ {\cal I}_{der}$. Let $ {\Pi}_{hist}$ and $ {\Pi}_{der}$ be the sets of all plans over $ {\cal I}_{hist}$ and $ {\cal I}_{der}$, respectively. Any $ \pi \in {\Pi}_{der}$ can be converted into an equivalent plan, $ \pi ' \in {\Pi}_{hist}$, as follows: For each $ {\eta}\in {\cal I}_{hist}$, define $ \pi '({\eta}) = \pi ({\kappa}({\eta}))$.
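
The conversion from $ \pi $ to $ \pi '$ is simply function composition. A minimal sketch follows, again assuming that histories are tuples of observations; `lift_plan`, `kappa_count`, and `pi_der` are hypothetical names chosen for illustration.

```python
def lift_plan(pi, kappa):
    """Return pi' on I_hist defined by pi'(eta) = pi(kappa(eta))."""
    def pi_prime(eta):
        return pi(kappa(eta))
    return pi_prime


# Hypothetical I-map: count how many times "door" has been observed.
def kappa_count(eta):
    return sum(1 for y in eta if y == "door")


# Hypothetical plan on the derived I-space (door counts).
def pi_der(count):
    return "enter" if count >= 2 else "advance"


pi_hist = lift_plan(pi_der, kappa_count)

# Two different histories with the same derived I-state get the same action.
eta_a = ("door", "wall", "door")
eta_b = ("door", "door", "wall")
```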

It is not always possible, however, to construct a plan, $ \pi \in {\Pi}_{der}$, from some $ \pi ' \in {\Pi}_{hist}$. The problem is that there may exist some $ {\eta}_1,{\eta}_2 \in {\cal I}_{hist}$ for which $ \pi '({\eta}_1) \not = \pi '({\eta}_2)$ and $ {\kappa}({\eta}_1) = {\kappa}({\eta}_2)$. In words, this means that the plan in $ {\Pi}_{hist}$ requires that two histories cause different actions, but in the derived I-space the histories cannot be distinguished. For a plan in $ {\Pi}_{der}$, both histories must yield the same action.
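
For a finite set of histories, this obstruction can be checked directly. The sketch below tries to build a plan on the derived I-space and gives up as soon as two histories with the same derived I-state demand different actions. The function `factor_through` and the example plans are hypothetical, and histories are again assumed to be tuples of observations.

```python
def factor_through(pi_hist, kappa, histories):
    """Try to build pi on I_der with pi(kappa(eta)) == pi_hist(eta).

    Returns the plan as a dict from derived I-states to actions, or None
    if two histories share a derived I-state but require different actions.
    """
    pi_der = {}
    for eta in histories:
        d = kappa(eta)
        u = pi_hist(eta)
        if pi_der.setdefault(d, u) != u:
            return None  # conflict: kappa cannot distinguish the histories
    return pi_der


def kappa_last(eta):
    return eta[-1]  # hypothetical I-map: most recent observation


def pi_ok(eta):
    # Depends only on the last observation, so it factors through kappa_last.
    return "left" if eta[-1] == "a" else "right"


def pi_bad(eta):
    # Depends on the history length, which kappa_last discards.
    return "left" if len(eta) == 1 else "right"


histories = [("a",), ("b",), ("a", "b"), ("b", "b")]
ok_result = factor_through(pi_ok, kappa_last, histories)
bad_result = factor_through(pi_bad, kappa_last, histories)
```

Here `pi_bad` fails because the histories `("b",)` and `("a", "b")` have the same last observation but are assigned different actions.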

An I-map $ {\kappa }$ has the potential to collapse $ {\cal I}_{hist}$ down to a smaller I-space by inducing a partition of $ {\cal I}_{hist}$. For each $ {\eta_{der}}\in {\cal I}_{der}$, let the preimage $ {\kappa}^{-1}({\eta_{der}})$ be defined as

$\displaystyle {\kappa}^{-1}({\eta_{der}}) = \{ {\eta}\in {\cal I}_{hist}\;\vert\; {\eta_{der}}= {\kappa}({\eta}) \} .$ (11.26)

This yields the set of history I-states that map to $ {\eta_{der}}$. The induced partition can intuitively be considered as the ``resolution'' at which the history I-space is characterized. If the sets in (11.26) are large, then the I-space is substantially reduced. The goal is to select $ {\kappa }$ to make the sets in the partition as large as possible; however, one must be careful to avoid collapsing the I-space so much that the problem can no longer be solved.
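
To make the induced partition concrete, the following sketch groups a small, finite set of histories by their image under an I-map, which yields the preimages of (11.26) restricted to that set. The helper `induced_partition` and the choice of binary observation histories are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def induced_partition(kappa, histories):
    """Map each derived I-state to the list of histories in its preimage."""
    cells = defaultdict(list)
    for eta in histories:
        cells[kappa(eta)].append(eta)
    return dict(cells)


# Assumed toy history I-space: observation tuples of length 1 to 3 over {0, 1}.
histories = [h for k in (1, 2, 3) for h in product((0, 1), repeat=k)]

# The identity I-map keeps every history in its own cell (no reduction).
fine = induced_partition(lambda eta: eta, histories)

# Keeping only the last observation collapses everything into two large cells.
coarse = induced_partition(lambda eta: eta[-1], histories)
```

With these 14 histories, the identity map gives 14 singleton cells, whereas the last-observation map gives 2 cells of 7 histories each.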

Example 11.11 (State Estimation)   In this example, the I-map is the classical approach that is conveniently taken in numerous applications. Suppose that a technique has been developed that uses the history I-state $ {\eta}\in {\cal I}_{hist}$ to compute an estimate of the current state. In this case, the I-map is $ {\kappa}_{est}: {\cal I}_{hist}\rightarrow X$. The derived I-space happens to be $ X$ in this case! This means that a plan is specified as $ \pi : X \rightarrow U$, which is just a state-feedback plan.

Consider the partition of $ {\cal I}_{hist}$ that is induced by $ {\kappa}_{est}$. For each $ x \in X$, the set $ {\kappa}_{est}^{-1}(x)$, as defined in (11.26), is the set of all histories that lead to the same state estimate. A plan on $ X$ can no longer distinguish between various histories that led to the same state estimate. One implication is that the ability to encode the amount of uncertainty in the state estimate has been lost. For example, it might be wise to make the action depend on the covariance in the estimate of $ x$; however, this is not possible because decisions are based only on the estimate itself. $ \blacksquare$
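
The loss of uncertainty information can be seen in a toy setting. The sketch below assumes a scalar state, noisy scalar observations, and a sample-mean estimator; none of these choices come from the text, and `kappa_est`, `spread`, and `pi_state` are hypothetical names.

```python
def kappa_est(eta):
    """Assumed estimator: the sample mean of the observation history."""
    return sum(eta) / len(eta)


def spread(eta):
    """Population variance of the observations (discarded by kappa_est)."""
    m = kappa_est(eta)
    return sum((y - m) ** 2 for y in eta) / len(eta)


def pi_state(x_hat):
    """Hypothetical state-feedback plan on X."""
    return "approach" if x_hat < 3.0 else "halt"


eta_confident = (2.0, 2.0, 2.0)  # consistent observations
eta_uncertain = (0.0, 2.0, 4.0)  # same mean, much larger spread

same_estimate = kappa_est(eta_confident) == kappa_est(eta_uncertain)
same_action = pi_state(kappa_est(eta_confident)) == pi_state(kappa_est(eta_uncertain))
```

Both histories lie in the same preimage $ {\kappa}_{est}^{-1}(2)$, so any plan on $ X$ must treat them identically, even though their spreads differ.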

Example 11.12 (Stage Indices)   Consider an I-map, $ {{\kappa}_{stage}}$, that returns only the current stage index. Thus, $ {{\kappa}_{stage}}({\eta}_k) = k$. The derived I-space is the set of stages, which is $ {\mathbb{N}}$. A feedback plan on the derived I-space is specified as $ \pi : {\mathbb{N}}\rightarrow U$. This is equivalent to specifying a plan as an action sequence, $ (u_1,u_2,\ldots)$, as in Section 2.3.2. Since the feedback is trivial, this is precisely the original case of planning without feedback, which is also referred to as an open-loop plan. $ \blacksquare$
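
A short sketch shows why such a plan ignores feedback. It assumes that $ {\eta}_k$ is encoded as a tuple holding exactly $ k$ observations, so the stage index is the tuple length; the action names and the helper `run` are hypothetical.

```python
def kappa_stage(eta):
    """Stage-index I-map; assumes eta_k holds exactly k observations."""
    return len(eta)


# A plan on the stage indices is just a lookup into an action sequence.
action_sequence = ("north", "north", "east")


def pi_open_loop(k):
    return action_sequence[k - 1]  # stages are numbered from 1


def run(observations):
    """Execute the plan; the observation values never affect the actions."""
    actions, eta = [], ()
    for y in observations:
        eta = eta + (y,)
        actions.append(pi_open_loop(kappa_stage(eta)))
    return actions


run_a = run(["clear", "clear", "clear"])
run_b = run(["wall", "door", "wall"])
```

Both runs produce the same action sequence, which is the sense in which the plan is open loop.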

Steven M LaValle 2020-08-14