Continuous action spaces

Now suppose that $ U(x)$ is continuous, in addition to $ X$. Assume that $ U(x)$ is a closed and bounded subset of $ {\mathbb{R}}^n$. Once again, the dynamic programming recurrence, (8.56), remains the same. The trouble now is that the $ \min$ represents an optimization problem over an uncountably infinite number of choices. One possibility is to employ nonlinear optimization techniques to select the optimal $ u \in U(x)$. The effectiveness of this approach depends heavily on the structure of $ U(x)$, $ X$, and the cost functional.
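
As a rough illustration of the optimization-based option, the following sketch minimizes a one-step backup over a one-dimensional action interval using golden-section search. Several assumptions are made here that are not in the text: the action set is an interval `[lo, hi]`, the backup has the form of a one-step cost plus the cost-to-go at the next state, and the resulting objective is unimodal in the action. The names `f`, `l`, and `G` are hypothetical placeholders for the state transition, the one-step cost, and an already computed (for example, interpolated) cost-to-go. Golden-section search is just one possible choice of technique.

```python
import math


def golden_section_min(h, lo, hi, tol=1e-6):
    """Minimize a unimodal function h on [lo, hi] by golden-section search.

    Returns (u, h(u)) for the midpoint of the final bracketing interval.
    Unimodality is an assumption; without it only a local minimum is found.
    """
    inv_phi = (math.sqrt(5) - 1) / 2  # 1 / golden ratio
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    hc, hd = h(c), h(d)
    while b - a > tol:
        if hc < hd:
            # Minimum lies in [a, d]; reuse c as the new right probe.
            b, d, hd = d, c, hc
            c = b - inv_phi * (b - a)
            hc = h(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new left probe.
            a, c, hc = c, d, hd
            d = a + inv_phi * (b - a)
            hd = h(d)
    u = (a + b) / 2
    return u, h(u)


def optimized_backup(x, lo, hi, f, l, G):
    """Approximate min over u in [lo, hi] of l(x, u) + G(f(x, u)).

    f, l, and G are hypothetical callables supplied by the caller.
    """
    return golden_section_min(lambda u: l(x, u) + G(f(x, u)), lo, hi)


if __name__ == "__main__":
    # Toy problem (assumed for illustration): x' = x + u, cost u^2, G(x) = x^2.
    u, cost = optimized_backup(
        1.0, -1.0, 1.0,
        f=lambda x, u: x + u,
        l=lambda x, u: u * u,
        G=lambda x: x * x,
    )
    print(u, cost)
```

For higher-dimensional action sets or objectives with several local minima, a different optimizer would be needed, which is one reason the effectiveness depends so strongly on the problem.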

Another approach is to evaluate (8.56) over a finite set of samples drawn from $ U(x)$. Again, it is best to choose samples that reduce the dispersion as much as possible. In some contexts, it may be possible to eliminate some actions from consideration by exploiting the properties of the cost-to-go function and its representation via interpolation.
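
The sampling approach can be sketched as follows, under the assumption that the action set is a one-dimensional interval and that the backup is a one-step cost plus the cost-to-go at the next state. The names `f`, `l`, and `G` are hypothetical placeholders for the state transition, the one-step cost, and an interpolated cost-to-go; they are not taken from the text. Samples are placed at the centers of equal-width cells, which gives dispersion `(hi - lo) / (2 * k)` for `k` samples on an interval.

```python
def sample_interval(lo, hi, k):
    """Return k samples at the centers of k equal cells covering [lo, hi].

    Every point of the interval is within (hi - lo) / (2 * k) of a sample.
    """
    width = (hi - lo) / k
    return [lo + (i + 0.5) * width for i in range(k)]


def sampled_backup(x, actions, f, l, G):
    """Approximate min over u of l(x, u) + G(f(x, u)) using a finite action list.

    Returns (best_action, best_cost). f, l, and G are hypothetical callables.
    """
    best_u, best_cost = None, float("inf")
    for u in actions:
        cost = l(x, u) + G(f(x, u))
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u, best_cost


if __name__ == "__main__":
    # Toy problem (assumed for illustration): x' = x + u, cost u^2, G(x) = x^2.
    actions = sample_interval(-1.0, 1.0, 5)
    u, cost = sampled_backup(
        1.0, actions,
        f=lambda x, u: x + u,
        l=lambda x, u: u * u,
        G=lambda x: x * x,
    )
    print(u, cost)
```

In this toy problem the exact minimizer is `u = -0.5`, so the selected sample can be off by at most the dispersion; increasing `k` tightens that bound at the price of more evaluations per backup.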



Steven M LaValle 2020-08-14